Grid Computing Lab


GRID COMPUTING LAB

Globus Toolkit

Introduction

 The open source Globus Toolkit is a fundamental enabling technology for the "Grid,"
letting people share computing power, databases, and other tools securely online across
corporate, institutional, and geographic boundaries without sacrificing local autonomy.
 The toolkit includes software services and libraries for resource monitoring, discovery,
and management, plus security and file management.
 In addition to being a central part of science and engineering projects that total nearly a
half-billion dollars internationally, the Globus Toolkit is a substrate on which leading IT
companies are building significant commercial Grid products.

Installation Procedure for Globus Toolkit


Mandatory prerequisite:
Linux 64 bit Operating System
Download all the software packages:
1. apache-tomcat-7.0.67.tar.gz
2. apache-ant-1.9.6-bin.tar.gz
3. junit3.8.1.zip
4. jdk-8u60-linux-x64.gz
Copy downloads to /usr/local and type the following commands

1. cp /home/stack/downloads/* /usr/local
2. pwd
3. tar zxvf jdk-8u60-linux-x64.gz

cd jdk1.8.0_60/

pwd

export JAVA_HOME=/usr/local/grid/SOFTWARE/jdk1.8.0_60
cd ..
4. tar zxvf apache-ant-1.9.6-bin.tar.gz

pwd
export ANT_HOME=/usr/local/grid/SOFTWARE/apache-ant-1.9.6
cd ..

5. tar zxvf apache-tomcat-7.0.67.tar.gz

cd apache-tomcat-7.0.67/
pwd
export CATALINA_HOME=/usr/local/grid/SOFTWARE/apache-tomcat-7.0.67

cd ..

6. unzip junit3.8.1.zip

cd junit3.8.1
pwd
export JUNIT_HOME=/usr/local/grid/SOFTWARE/junit3.8.1

cd ..
pwd

7. dpkg -i globus-toolkit-repo_latest_all.deb

8. apt-get update

9. apt-get install globus-data-management-client

apt-get install globus-gridftp

apt-get install globus-gram5
apt-get install globus-gsi
apt-get install myproxy
apt-get install myproxy-server
apt-get install myproxy-admin

10. grid-cert-info -subject

11. grid-mapfile-add-entry {-------output of grid-cert-info -subject-----} gtuser

grid-proxy-init -verify -debug

12. service globus-gridftp-server start

service globus-gridftp-server status

13. myproxy-logon -s {name}

14. service globus-gatekeeper start

service globus-gatekeeper status

15. globus-job-run name /bin/hostname


Ex. No:1 Develop new web service for calculator
Date :

Aim:
To develop a new web service for a calculator using the Globus Toolkit.
Procedure :
When you start the Globus Toolkit container, a number of services start up.
The service for this task will be a simple Math service that can perform basic arithmetic for a client.
The Math service will access a resource with two properties:
1. An integer value that can be operated upon by the service
2. A string value that holds a description of the last operation

The service itself will have three remotely accessible operations that operate upon value:
(a) add, which adds a to the resource property value.
(b) subtract, which subtracts a from the resource property value.
(c) getValueRP, which returns the current value of value.
Usually, the best way to begin any programming task is with an overall description of what you want the code to do, which in this case is the service interface. The service interface describes what the service provides in terms of the names of operations, their arguments and return values.

A Java interface for our service is:

public interface Math {
    public void add(int a);
    public void subtract(int a);
    public int getValueRP();
}
It is possible to start with this interface and create the necessary WSDL file using the
standard Web service tool called Java2WSDL. However, the WSDL file for GT 4 has to include
details of resource properties that are not given explicitly in the interface above.
Hence, we will provide the WSDL file.

Step 1 – Getting the Files
All the required files are provided and come directly from [1]. The MathService source code files can be found at http://www.gt4book.com (http://www.gt4book.com/downloads/gt4book-examples.tar.gz). A Windows zip compressed version can be found at http://www.cs.uncc.edu/~abw/ITCS4146S07/gt4book-examples.zip. Download and uncompress the file into a directory called GT4services. Everything is included (the Java source, WSDL, and deployment files, etc.)
WSDL service interface description file -- The WSDL service interface description file is
provided within the GT4services folder at:
GT4Services\schema\examples\MathService_instance\Math.wsdl This file, and discussion of its
contents, can be found in Appendix A. Later on we will need to modify this file, but first we will
use the existing contents that describe the Math service above.
Service code in Java -- For this assignment, both the code for service operations and for the
resource properties are put in the same class for convenience. More complex services and
resources would be defined in separate classes.

The Java code for the service and its resource properties is located within the GT4services folder
at:
GT4services\org\globus\examples\services\core\first\impl\MathService.java.
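
For orientation, a much-simplified, plain-Java sketch of what such a class does is shown below. This is an illustration only, not the actual tutorial file: the real MathService.java additionally uses the GT4 resource-property APIs so that the value and last-operation fields are exposed as WSRF resource properties.

// Simplified sketch (an assumption for illustration only); the real MathService.java
// in the GT4 tutorial wires these fields up as WSRF resource properties.
public class MathService implements Math {

    private int value = 0;            // resource property: current value
    private String lastOp = "NONE";   // resource property: last operation performed

    public void add(int a) {
        value += a;
        lastOp = "ADDITION";
    }

    public void subtract(int a) {
        value -= a;
        lastOp = "SUBTRACTION";
    }

    public int getValueRP() {
        return value;
    }
}
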
Deployment Descriptor -- The deployment descriptor gives several different important
sets of information about the service once it is deployed. It is located within the GT4services
folder at: GT4services\org\globus\examples\services\core\first\deploy-server.wsdd.

Step 2 – Building the Math Service
It is now necessary to package all the required files into a GAR (Grid Archive) file.
The build tool ant from the Apache Software Foundation is used to achieve this as shown
overleaf:

Generating a GAR file with Ant (from http://gdp.globus.org/gt4-tutorial/multiplehtml/ch03s04.html): Ant is similar in concept to the Unix make tool, but it is a Java tool and XML based. Build scripts are provided by Globus 4 to use with the Ant build file. The Windows version of the build script for MathService is the Python file called globus-build-service.py, which is held in the GT4services directory. The build script takes one argument, the name of the service that you want to deploy. To keep with the naming convention in [1], this service will be called first.
In the Client Window, run the build script from the GT4services directory with:

globus-build-service.py first

The output should look similar to the following:

Buildfile: build.xml

BUILD SUCCESSFUL
Total time: 8 seconds

During the build process, a new directory is created in your GT4Services directory that is
named build. All of your stubs and class files that were generated will be in that directory and its
subdirectories.
More importantly, there is a GAR (Grid Archive) file called
org_globus_examples_services_core_first.gar. The GAR file is the package that contains every
file that is needed to successfully deploy your Math Service into the Globus container.
The files contained in the GAR file are the Java class files, WSDL, compiled stubs, and the deployment descriptor.

Step 3 – Deploying the Math Service
If the container is still running in the Container Window, then stop it using Control-C.
To deploy the Math Service, you will use a tool provided by the Globus Toolkit called globus-deploy-gar. In the Container Window, issue the command:

globus-deploy-gar org_globus_examples_services_core_first.gar

Successful output of the command is:

The service has now been deployed.


Check service is deployed by starting container from the Container Window:

You should see the service called MathService.

Step 4 – Compiling the Client
A client has already been provided to test the Math Service and is located in the GT4Services directory.

Step 5 – Start the Container for your Service
If the container is not running, restart the Globus container from the Container Window with:

globus-start-container -nosec

Step 6 – Run the Client
To start the client from your GT4Services directory, do the following in the Client Window, which passes the GSH of the service as an argument:

java -classpath build\classes\org\globus\examples\services\core\first\impl\;%CLASSPATH% org.globus.examples.clients.MathService_instance.Client http://localhost:8080/wsrf/services/examples/core/first/MathService

which should give the output:
Current value: 15
Current value: 10
Step 7 – Undeploy the Math Service and Kill a Container
Before we can add functionality to the Math Service (Section 5), we must undeploy the service.
In the Container Window, kill the container with a Control-C.
Then to undeploy the service, type in the following command:
globus-undeploy-gar org_globus_examples_services_core_first
which should result with the following output:
Undeploying gar... Deleting /.
Undeploy successful

Result:
Thus the new web service for the calculator was developed successfully and the output verified.
Ex. No:2 Developing New Grid Service
Date :

Aim:
To Develop a new Grid Service
Procedure :

1. Setting up Eclipse, GT4, Tomcat, and the other necessary plug-ins and tools
2. Creating and configuring the Eclipse project in preparation for the source files
3. Adding the source files (and reviewing their major features)
4. Creating the build/deploy Launch Configuration that orchestrates the automatic
generation of the remaining artifacts, assembling the GAR, and deploying the grid service
into the Web services container
5. Using the Launch Configuration to generate and deploy the grid service
6. Running and debugging the grid service in the Tomcat container
7. Executing the test client
8. To test the client, simply right-click the Client.java file and select Run > Run... from the pop-up menu.
9. In the Run dialog that is displayed, select the Arguments tab and enter http://127.0.0.1:8080/wsrf/services/examples/ProvisionDirService in the Program Arguments: textbox.
10. Run the client application by simply right-clicking the Client.java file and selecting Run > Java Application.

Output

Run Java Application

Result:
Thus the new Grid service was developed successfully and the output verified.
Ex. No:3 Develop applications using Java - Grid APIs
Date :

Aim:
To develop Applications using Java – Grid APIs

Procedure:
1. Build a server-side SOAP service using Tomcat and Axis

2. Create connection stubs to support client-side use of the SOAP service

3. Build a custom client-side ClassLoader

4. Build the main client application

5. Build a trivial compute task designed to exercise the client ClassLoader (a sketch of steps 3 and 5 follows this list)

6. Test the grid computing framework
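
Steps 3 and 5 are plain Java. A minimal sketch of what they could look like is given below; the interface name ComputeTask, the class NetworkClassLoader, and the idea of obtaining class bytes through the SOAP stubs are illustrative assumptions, not the framework's actual code.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical contract for work units shipped to the client (an assumption).
interface ComputeTask {
    Object execute();
}

// Step 5: a trivial compute task used only to exercise the custom ClassLoader.
class AdditionTask implements ComputeTask {
    public Object execute() {
        return Integer.valueOf(2 + 3);
    }
}

// Step 3: a custom ClassLoader that turns raw class bytes (for example, bytes
// downloaded from the SOAP service) into a loadable class.
class NetworkClassLoader extends ClassLoader {

    // Define a class directly from its bytecode.
    public Class<?> defineTaskClass(String name, byte[] bytecode) {
        return defineClass(name, bytecode, 0, bytecode.length);
    }

    // Helper: read all bytes from a stream (for example, a SOAP attachment).
    public static byte[] readAllBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }
}

The main client application (step 4) would obtain the task's bytecode through the stubs generated in step 2, define the class with the loader, instantiate it, and call execute().
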

Output

Result:
Thus the applications using Java Grid APIs were developed successfully and the output verified.
Ex. No:4 Develop secured applications using basic security in Globus
Date :

Aim:
To Develop secured applications using basic security in Globus.
Procedure:
Mandatory prerequisite:

 Tomcat v4.0.3
 Axis beta 1
 Commons Logging v1.0
 Java CoG Kit v0.9.12
 Xerces v2.0.1

1. Install the software: install Tomcat and deploy Axis on Tomcat

2. Install libraries to provide GSI support for Tomcat

 Copy cog.jar, cryptix.jar, iaik_ssl.jar, iaik_jce_full.jar, iaik_javax_crypto.jar to Tomcat's


common/lib directory.
 Check that log4j-core.jar and xerces.jar (or other XML parser) are in Tomcat's
common/lib directory.
 Copy gsicatalina.jar to Tomcat's server/lib directory.

3. Deploy GSI support in Tomcat

 Edit Tomcat's conf/server.xml

Add GSI Connector in <service> section:

<!-- Define a GSI HTTP/1.1 Connector on port 8443


Supported parameters include:
proxy // proxy file for server to use
or
cert // server certificate file in PEM format
key // server key file in PEM format

cacertdir // directory location containing trusted CA certs


gridMap // grid map file used for authorization of users
debug // "0" is off and "1" and greater for more info
-->
<Connector className="org.apache.catalina.connector.http.HttpConnector"
port="8443" minProcessors="5" maxProcessors="75"
enableLookups="true" authenticate="true"
acceptCount="10" debug="1" scheme="httpg" secure="true">
<Factory
className="org.globus.tomcat.catalina.net.GSIServerSocketFactory"
cert="/etc/grid-security/hostcert.pem"
key="/etc/grid-security/hostkey.pem"
cacertdir="/etc/grid-security/certificates"
gridmap="/etc/grid-security/gridmap-file"
debug="1"/>
</Connector>
If you are testing under a user account, make sure that the proxy or certificates and keys are
readable by Tomcat. For testing purposes you can use user proxies or certificates instead of host
certificates e.g.:
<Connector className="org.apache.catalina.connector.http.HttpConnector"
port="8443" minProcessors="5" maxProcessors="75"
enableLookups="true" authenticate="true"
acceptCount="10" debug="1" scheme="httpg" secure="true">
<Factory
className="org.globus.tomcat.catalina.net.GSIServerSocketFactory"
proxy="/tmp/x509u_up_neilc"
debug="1"/>
</Connector>
If you do test using user proxies, make sure the proxy has not expired!
Add a GSI Valve in the <engine> section:

<Valve className="org.globus.tomcat.catalina.valves.CertificatesValve"
debug="1" />

4. Install libraries to provide GSI support for Axis

 Copy gsiaxis.jar to the WEB-INF/lib directory of your Axis installation under Tomcat.

5. Set your CLASSPATH correctly

 You should ensure that the following jars from the axis/lib directory are in your
classpath:
o axis.jar
o clutil.jar
o commons-logging.jar
o jaxrpc.jar
o log4j-core.jar
o tt-bytecode.jar
o wsdl4j.jar
 You should also have these jars in your classpath:
o gsiaxis.jar
o cog.jar
o xerces.jar (or other XML parser)

6. Start the GSI enabled Tomcat/Axis server

 Start up Tomcat as normal

Check the logs in Tomcat's logs/ directory to ensure the server started correctly. In particular
check that:

 apache_log.YYYY-MM-DD.txt does not contain any GSI related error messages


 catalina.out contains messages saying "Welcome to the IAIK ... Library"
 catalina_log.YYYY-MM-DD.txt contains messages saying "HttpConnector[8443]
Starting background thread" and "HttpProcessor[8443][N] Starting background thread"
 localhost_log.YYYY-MM-DD.txt contains a message saying "WebappLoader[/axis]:
Deploy JAR /WEB-INF/lib/gsiaxis.jar"

7. Writing a GSI enabled Web Service

7.1. Implementing the service

The extensions made to Tomcat allow us to receive credentials through a transport-level security mechanism. Tomcat exposes these credentials, and Axis makes them available as part of the MessageContext.

Alpha 3 version

Let's assume we already have a web service called MyService with a single method, myMethod. When a SOAP message request comes in over the GSI httpg transport, the Axis RPC dispatcher will look for the same method, but with an additional parameter: the MessageContext. So we can write a new myMethod which takes an additional argument, the MessageContext.

This can be illustrated in the following example:

package org.globus.example;

import org.apache.axis.MessageContext;
import org.globus.axis.util.Util;

public class MyService {

    // The "normal" method
    public String myMethod(String arg) {
        System.out.println("MyService: http request\n");
        System.out.println("MyService: you sent " + arg);
        return "Hello Web Services World!";
    }

    // Add a MessageContext argument to the normal method
    public String myMethod(MessageContext ctx, String arg) {
        System.out.println("MyService: httpg request\n");
        System.out.println("MyService: you sent " + arg);
        System.out.println("GOT PROXY: " + Util.getCredentials(ctx));
        return "Hello Web Services World!";
    }
}

Beta 1 version

In the Beta 1 version, you don't even need to write a different method. Instead, the MessageContext is put on thread-local store.
It can be retrieved by calling MessageContext.getCurrentContext():

package org.globus.example;

import org.apache.axis.MessageContext;
import org.globus.axis.util.Util;

public class MyService {

    // Beta 1 version
    public String myMethod(String arg) {
        System.out.println("MyService: httpg request\n");
        System.out.println("MyService: you sent " + arg);
        // Retrieve the context from thread local
        MessageContext ctx = MessageContext.getCurrentContext();
        System.out.println("GOT PROXY: " + Util.getCredentials(ctx));
        return "Hello Web Services World!";
    }
}
Part of the code provided by ANL in gsiaxis.jar is a utility package which includes the
getCredentials() method. This allows the service to extract the proxy credentials from the
MessageContext.
7.2. Deploying the service

Before the service can be used it must be made available. This is done by deploying the service.
This can be done in a number of ways:

1. Use the Axis AdminClient to deploy the MyService classes.

2. Add the following entry to the server-config.wsdd file in the WEB-INF directory of Axis on Tomcat:

<service name="MyService" provider="java:RPC">
  <parameter name="methodName" value="*"/>
  <parameter name="className" value="org.globus.example.MyService"/>
</service>

8. Writing a GSI enabled Web Service client

As in the previous example, this is very similar to writing a normal web services client. There are some additions required to use the new GSI over SSL transport (a sketch of such a client is given after the list below):

 Deploy a httpg transport chain


 Use the Java CoG kit to load a Globus proxy
 Use setProperty() to set GSI specifics in the Axis "Property Bag":
o globus credentials (the proxy certificate)
o authorisation type
o GSI mode (SSL, no delegation, full delegation, limited delegation)
 Continue with the normal Axis SOAP service invocation:
o Set the target address for the service
o Provide the name of the method to be invoked
o Pass on any parameters required
o Set the type of the returned value
o Invoke the service
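
A minimal sketch of such a client is shown below. It follows the overall pattern of the gsiaxis examples, but the Globus-specific classes (GSIHTTPSender, GSIHTTPTransport, SelfAuthorization, GlobusProxy) and the property constants are assumptions about this old CoG/gsiaxis release and should be checked against the jars you actually installed.

package org.globus.example;

import java.net.URL;

import javax.xml.rpc.ParameterMode;

import org.apache.axis.SimpleTargetedChain;
import org.apache.axis.client.Call;
import org.apache.axis.client.Service;
import org.apache.axis.configuration.SimpleProvider;
import org.apache.axis.encoding.XMLType;

// The imports below are assumptions based on the gsiaxis/CoG examples.
import org.globus.axis.transport.GSIHTTPSender;
import org.globus.axis.transport.GSIHTTPTransport;
import org.globus.axis.util.Util;
import org.globus.security.GlobusProxy;
import org.globus.security.auth.SelfAuthorization;

public class Client {
    public static void main(String[] args) throws Exception {
        // Invoked as: java org.globus.example.Client -l <httpg URL> "Hello!"
        String endpoint = args[1];
        String message = args[2];

        // Deploy an httpg transport chain and register the protocol handler.
        SimpleProvider provider = new SimpleProvider();
        provider.deployTransport("httpg", new SimpleTargetedChain(new GSIHTTPSender()));
        Util.registerTransport();

        Service service = new Service(provider);
        Call call = (Call) service.createCall();

        // Use the Java CoG kit to load a Globus proxy and put the GSI
        // specifics into the Axis "property bag".
        GlobusProxy proxy = GlobusProxy.getDefaultUserProxy();
        call.setProperty(GSIHTTPTransport.GSI_CREDENTIALS, proxy);
        call.setProperty(GSIHTTPTransport.GSI_AUTHORIZATION, new SelfAuthorization(proxy));
        call.setProperty(GSIHTTPTransport.GSI_MODE, GSIHTTPTransport.GSI_MODE_LIMITED_DELEG);

        // Normal Axis SOAP invocation: target address, method, parameters, return type.
        call.setTargetEndpointAddress(new URL(endpoint));
        call.setOperationName("myMethod");
        call.addParameter("arg0", XMLType.XSD_STRING, ParameterMode.IN);
        call.setReturnType(XMLType.XSD_STRING);

        String result = (String) call.invoke(new Object[] { message });
        System.out.println("Service responded: " + result);
    }
}
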

You can invoke this client by running:


java org.globus.example.Client -l
httpg://127.0.0.1:8443/axis/servlet/AxisServlet "Hello!"

Result:
Thus the secured application using basic security in Globus was developed successfully and the output verified.

Ex. No:5 Develop a Grid portal
Date :

Aim :
To Develop a Grid portal, where user can submit a job and get the result. Implement it with and without GRAM concept.
Procedure:
1) Building the GridSphere distribution requires Java 1.5+. You will also need Ant 1.6+, available at http://jakarta.apache.org/ant.
2) You will also need a Tomcat 5.5.x servlet container available at
http://jakarta.apache.org/tomcat. In addition to providing a hosting environment for GridSphere,
Tomcat provides some of the required XML (JAR) libraries that are needed for compilation.
3) Compiling and Deploying
4) The Ant build script, build.xml, uses the build.properties file to specify any compilation
options. Edit build.properties appropriately for your needs.
5) At this point, simply invoking "ant install" will deploy the GridSphere portlet container to
Tomcat using the default database. Please see the User Guide for more details on configuring the
database.
6) The build.xml supports the following basic tasks:
install -- builds and deploys GridSphere, makes the documentation and installs the database
clean -- removes the build and dist directories including all the compiled classes
update -- updates the existing source code from CVS
compile -- compiles the GridSphere source code
deploy -- deploys the GridSphere framework and all portlets to a Tomcat servlet container
located at $CATALINA_HOME
create-database - creates a new, fresh database with original GridSphere settings, this wipes out
your current database
docs -- builds the Javadoc documentation from the source code
To see all the targets invoke "ant --projecthelp".
7) Start up Tomcat and then go to http://127.0.0.1:8080/gridsphere/gridsphere to see the portal.

CLOUD COMPUTING LAB


Mandatory prerequisite:
Linux 64 bit Operating System
Installing KVM (Hypervisor for Virtualization)
1. Please check if the Virtualization flag is enabled in BIOS
Run the command in terminal
egrep -c '(vmx|svm)' /proc/cpuinfo

If the result is any value higher than 0, then virtualization is enabled.


If the value is 0, then in BIOS enable Virtualization – Consult system
administrator
for this step.
2. To check if your OS is 64 bit,
Run the command in terminal

uname -m

If the result is x86_64, it means that your Operating system is 64 bit Operating
system.
3. A few KVM packages are available with the Linux installation.
To check this, run the command,
ls /lib/modules/{press tab}/kernel/arch/x86/kvm
The three files which are installed in your system will be displayed
kvm-amd.ko kvm-intel.ko kvm.ko
4. Install the KVM packages
1. Switch to root (Administrator) user
sudo -i
2. To install the packages, run the following commands,
apt-get update
apt-get install qemu-kvm
apt-get install libvirt-bin
apt-get install bridge-utils
apt-get install virt-manager
apt-get install qemu-system
5. To verify your installation, run the command
virsh -c qemu:///system list
it shows output:
Id Name State
-------------------------------------------
If VMs are running, then it shows the names of the VMs. If no VM is running, the system shows blank output, which means your KVM installation is fine.
6. Run the command
virsh --connect qemu:///system list --all
7. Working with KVM
run the command
virsh
version (this command displays version of software tools installed)
nodeinfo (this command displays your system information)
quit (come out of the system)
8. To test KVM installation - we can create Virtual machines but these machines are to be done
in manual mode. Skipping this, Directly install Openstack.

Installation of Openstack
1. Add a new user named stack – this stack user is the administrator of the OpenStack services.
To add new user – run the command as root user.
adduser stack
2. run the command
apt-get install sudo -y || install -y sudo
3. Be careful when running this command – please check the syntax carefully. If there is any error in the following command, the system will crash because of permission errors.
echo "stack ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
4. Logout the system and login as stack user
5. Run the command (this installs git repo package)
Please ensure that you are logged in as the non-root user (stack user), and not in the /root directory.
sudo apt-get install git
6. Run the command (this clones the updated version of devstack, which is the auto-installer package of OpenStack)
git clone https://git.openstack.org/openstack-dev/devstack
ls (this shows a folder named devstack)
cd devstack (enter into the folder)
7. create a file called local.conf. To do this run the command,
nano local.conf
8. In the file, make the following entry (Contact Your Network Adminstrator for doubts in these
values)
[[local|localrc]]
FLOATING_RANGE=192.168.1.224/27
FIXED_RANGE=10.11.11.0/24
FIXED_NETWORK_SIZE=256
FLAT_INTERFACE=eth0
ADMIN_PASSWORD=root
DATABASE_PASSWORD=root
RABBIT_PASSWORD=root
SERVICE_PASSWORD=root
SERVICE_TOKEN=root
9. Save this file
10. Run the command (This installs OpenStack)
./stack.sh
11. If any error occurs, then run the command for uninstallation
./unstack.sh
1. update the packages
apt-get update

2. Then reinstall the package


./stack.sh
12. Open the browser, http://IP address of your machine, you will get the openstack portal.
13. If you restart the machine, then to again start open stack
open terminal,
su stack
cd devstack
run ./rejoin.sh
14. Again you can access openstack services in the browser, http://IP address of your machine.
Ex. No:1 Procedure to run the virtual machine
Date :

Aim :
To find the procedure to run virtual machines of different configurations and to check how many virtual machines can be utilized at a particular time.
Procedure:
This experiment is to be performed through the portal. Log in to the OpenStack portal and, under Instances, create virtual machines.
TO RUN VM
Step 1 : Under the Project Tab, Click Instances. In the right side screen Click Launch Instance.
Step 2 : In the details, Give the instance name(eg. Instance1).
Step 3: Click Instance Boot Source list and choose 'Boot from image'
Step 4: Click Image name list and choose the image currently uploaded.
Step 5: Click launch.
Your VM will get created.
Ex. No:2
Procedure to attach virtual block to the virtual machine
Date :

Aim:
To find procedure to attach virtual block to the virtual machine and check whether it
holds the data even after the release of the virtual machine.
Procedure:
 This experiment is to be performed through the portal. Login into the Openstack portal and, in Instances, create virtual machines.
 In Volumes, create a storage block of available capacity. Attach / mount the storage block volumes to virtual machines, unmount the volume and reattach it.
 Volumes are block storage devices that you attach to instances to enable persistent storage. You can attach a volume to a running instance or detach a volume and attach it to another instance at any time. You can also create a snapshot from or delete a volume. Only administrative users can create volume types.
Create a volume
1. Log in to the dashboard.

2. Select the appropriate project from the drop down menu at the top left.

3. On the Project tab, open the Compute tab and click Volumes category.

4. Click Create Volume.

In the dialog box that opens, enter or select the following values.

Volume Name: Specify a name for the volume.

Description: Optionally, provide a brief description for the volume.

Volume Source: Select one of the following options:

o No source, empty volume: Creates an empty volume. An empty volume does not
contain a file system or a partition table.
o Image: If you choose this option, a new field for Use image as a source displays.
You can select the image from the list.
o Volume: If you choose this option, a new field for Use volume as a
source displays. You can select the volume from the list. Options to use a
snapshot or a volume as the source for a volume are displayed only if there are
existing snapshots or volumes.

Type: Leave this field blank.

Size (GB): The size of the volume in gibibytes (GiB).

Availability Zone: Select the Availability Zone from the list. By default, this value is set
to the availability zone given by the cloud provider (for example, us-west or apac-south). For
some cases, it could be nova.

5. Click Create Volume.

The dashboard shows the volume on the Volumes tab.

Attach a volume to an instance

After you create one or more volumes, you can attach them to instances. You can attach a
volume to one instance at a time.

1. Log in to the dashboard.

2. Select the appropriate project from the drop down menu at the top left.

3. On the Project tab, open the Compute tab and click Volumes category.

4. Select the volume to add to an instance and click Manage Attachments.

5. In the Manage Volume Attachments dialog box, select an instance.

6. Enter the name of the device from which the volume is accessible by the instance.

7. Click Attach Volume.

The dashboard shows the instance to which the volume is now attached and the device
name.
You can view the status of a volume in the Volumes tab of the dashboard. The volume is either
Available or In-Use.

Now you can log in to the instance and mount, format, and use the disk.

Detach a volume from an instance

1. Log in to the dashboard.


2. Select the appropriate project from the drop down menu at the top left.
3. On the Project tab, open the Compute tab and click the Volumes category.
4. Select the volume and click Manage Attachments.
5. Click Detach Volume and confirm your changes.

A message indicates whether the action was successful.

Create a snapshot from a volume

1. Log in to the dashboard.

2. Select the appropriate project from the drop down menu at the top left.

3. On the Project tab, open the Compute tab and click Volumes category.

4. Select a volume from which to create a snapshot.

5. In the Actions column, click Create Snapshot.

6. In the dialog box that opens, enter a snapshot name and a brief description.

7. Confirm your changes.

The dashboard shows the new volume snapshot in Volume Snapshots tab.

Edit a volume

1. Log in to the dashboard.


2. Select the appropriate project from the drop down menu at the top left.

3. On the Project tab, open the Compute tab and click Volumes category.

4. Select the volume that you want to edit.

5. In the Actions column, click Edit Volume.

6. In the Edit Volume dialog box, update the name and description of the volume.

7. Click Edit Volume.

Delete a volume

When you delete an instance, the data in its attached volumes is not deleted.

1. Log in to the dashboard.

2. Select the appropriate project from the drop down menu at the top left.

3. On the Project tab, open the Compute tab and click Volumes category.

4. Select the check boxes for the volumes that you want to delete.

5. Click Delete Volumes and confirm your choice.

A message indicates whether the action was successful.

Ex. No:3 Install a C compiler in the virtual machine and execute a sample program

Aim :

To install a C compiler in the virtual machine and execute a sample program, and to show virtual machine migration from one node to another based on a certain condition.

Procedure:
1. Install a C compiler in the virtual machine and execute a sample program.
Through Openstack portal create virtual machine. Through the portal connect to virtual
machines. Login to VMs and install c compiler using commands.
Eg : apt-get install gcc

2. Show the virtual machine migration based on the certain condition from one node to the other.

To demonstrate virtual machine migration, two machines must be configured in one cloud. Take a snapshot of the running virtual machine, copy the snapshot file to the destination machine and restore the snapshot. On restoring the snapshot, the VM running on the source will be migrated to the destination machine.

1. List the VMs you want to migrate, run:

$ nova list

2. After selecting a VM from the list, run this command where VM_ID is set to the ID in the
list returned in the previous step:

$ nova show VM_ID

3. Use the nova migrate command.

$ nova migrate VM_ID

4. To migrate an instance and watch the status, use this example script:

#!/bin/bash

# Provide usage
usage() {
    echo "Usage: $0 VM_ID"
    exit 1
}

[[ $# -eq 0 ]] && usage

# Migrate the VM to an alternate hypervisor
echo -n "Migrating instance to alternate host"

VM_ID=$1
nova migrate $VM_ID

VM_OUTPUT=`nova show $VM_ID`
VM_STATUS=`echo "$VM_OUTPUT" | grep status | awk '{print $4}'`

while [[ "$VM_STATUS" != "VERIFY_RESIZE" ]]; do
    echo -n "."
    sleep 2
    VM_OUTPUT=`nova show $VM_ID`
    VM_STATUS=`echo "$VM_OUTPUT" | grep status | awk '{print $4}'`
done

nova resize-confirm $VM_ID
echo " instance migrated and resized."
echo;

# Show the details for the VM
echo "Updated instance details:"
nova show $VM_ID

# Pause to allow users to examine VM details
read -p "Pausing, press <enter> to exit."


Result :
Thus a C compiler was installed in the virtual machine, a sample program was executed successfully, and the output verified.

Ex. No:4 Procedure to install storage controller and interact with it

Aim :

To find procedure to install storage controller and interact with it.


Procedure:

The storage controller will be installed as the Swift and Cinder components when installing OpenStack. Interaction with the storage will be done through the portal.

OpenStack Object Storage (swift) is used for redundant, scalable data storage using clusters of
standardized servers to store petabytes of accessible data. It is a long-term storage system for
large amounts of static data which can be retrieved and updated.

OpenStack Object Storage provides a distributed, API-accessible storage platform that can be
integrated directly into an application or used to store any type of file, including VM images,
backups, archives, or media files. In the OpenStack dashboard, you can only manage containers
and objects.

In OpenStack Object Storage, containers provide storage for objects in a manner similar to a
Windows folder or Linux file directory, though they cannot be nested. An object in OpenStack
consists of the file to be stored in the container and any accompanying metadata.

Create a container
Log in to the dashboard

1. Select the appropriate project from the drop down menu at the top left.
2. On the Project tab, open the Object Store tab and click Containers category.
3. Click Create Container.
4. In the Create Container dialog box, enter a name for the container, and then click Create
Container.

You have successfully created a container.

Upload an object

1. Log in to the dashboard.

2. Select the appropriate project from the drop down menu at the top left.

3. On the Project tab, open the Object Store tab and click Containers category.

4. Select the container in which you want to store your object.

5. Click Upload Object.

The Upload Object To Container: <name> dialog box appears. <name> is the name of the
container to which you are uploading the object.

6. Enter a name for the object.

7. Browse to and select the file that you want to upload.

8. Click Upload Object.

You have successfully uploaded an object to the container.

Manage an object

To edit an object
1. Log in to the dashboard.

2. Select the appropriate project from the drop down menu at the top left.

3. On the Project tab, open the Object Store tab and click Containers category.

4. Select the container in which you want to store your object.

5. Click the menu button and choose Edit from the dropdown list.

The Edit Object dialog box is displayed.

6. Browse to and select the file that you want to upload.

7. Click Update Object.

Result:
Thus the procedure to install the storage controller and interact with it was completed successfully and the output verified.
Ex. No:5 Procedure to set up the one node Hadoop cluster

Aim :

To find the procedure to set up a one node Hadoop cluster.


Procedure:

Mandatory prerequisite:

 Linux 64 bit Operating System


 Installing Java v1.8

 Configuring SSH access.

sudo apt-get install vim

1) Installing Java:

Hadoop is a framework written in Java for running applications on large clusters of commodity
hardware. Hadoop needs Java 6 or above to work.
Step 1: Download the JDK tar.gz file for Linux 64-bit and extract it into /opt

boss@solaiv[]# cd /opt

boss@solaiv[]# sudo tar xvpzf /home/itadmin/Downloads/jdk-8u5-linux-x64.tar.gz


boss@solaiv[]# cd /opt/jdk1.8.0_05

Step 2:

Open the /etc/profile file and add the following lines (as per your version) to set the environment for Java.

Use the root user to save /etc/profile, or use gedit instead of vi.
The 'profile' file contains commands that ought to be run for login shells.

boss@solaiv[]# sudo vi /etc/profile

#-- insert JAVA_HOME
JAVA_HOME=/opt/jdk1.8.0_05
#-- in PATH variable, just append at the end of the line
PATH=$PATH:$JAVA_HOME/bin
#-- append JAVA_HOME at the end of the export statement
export PATH JAVA_HOME

Save the file by pressing the "Esc" key followed by :wq!

Step 3: Source the /etc/profile

boss@solaiv[]# source /etc/profile


Step 4: Update the Java alternatives

By default the OS will have OpenJDK; checking with "java -version" will report "openjdk". If your system has more than one version of Java, configure which one your system uses by entering the following commands in a terminal window. After updating the alternatives, "java -version" should report "Java HotSpot(TM) 64-Bit Server".

boss@solaiv[]# update-alternatives --install "/usr/bin/java" java "/opt/jdk1.8.0_05/bin/java" 1

boss@solaiv[]# update-alternatives --config java (then type the selection number)

boss@solaiv[]# java -version

2) Configure SSH
Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine if you want to use Hadoop on it (which is what we want to do in this short tutorial). For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost.
We need password-less, SSH key based authentication so that the master node can log in to the slave nodes (and the secondary node) to start/stop them easily without any delays for authentication.

If you skip this step, then you will have to provide a password each time.

Generate an SSH key for the user, then enable password-less SSH access to your local machine:

sudo apt-get install openssh-server

-- You will be asked to enter a password:
root@solaiv[]# ssh localhost

root@solaiv[]# ssh-keygen
root@solaiv[]# ssh-copy-id -i localhost

-- After the above two steps, you will be connected without a password:
root@solaiv[]# ssh localhost
root@solaiv[]# exit

3) Hadoop installation
Now download Hadoop from the official Apache site, preferably a stable release version of Hadoop 2.7.x, and extract the contents of the Hadoop package to a location of your choice.

We chose location as “/opt/”

Step 1: Download the tar.gz file of latest version Hadoop ( hadoop-2.7.x) from the official site .
Step 2: Extract(untar) the downloaded file from this commands to /opt/bigdata

root@solaiv[]# cd /opt
root@solaiv[/opt]# sudo tar xvpzf /home/itadmin/Downloads/hadoop-2.7.0.tar.gz
root@solaiv[/opt]# cd hadoop-2.7.0/

Like Java, update the Hadoop environment variables in /etc/profile

boss@solaiv[]# sudo vi /etc/profile


#-- insert HADOOP_PREFIX
HADOOP_PREFIX=/opt/hadoop-2.7.0
#-- in PATH variable, just append at the end of the line
PATH=$PATH:$HADOOP_PREFIX/bin
#-- append HADOOP_PREFIX at the end of the export statement
export PATH JAVA_HOME HADOOP_PREFIX

Save the file by pressing the "Esc" key followed by :wq!

Step 3: Source the /etc/profile

boss@solaiv[]# source /etc/profile

Verify Hadoop installation

boss@solaiv[]# cd $HADOOP_PREFIX

boss@solaiv[]# bin/hadoop version

3.1) Modify the Hadoop Configuration Files


In this section, we will configure the directory where Hadoop stores its configuration files, the network ports it listens to, etc. Our setup will use the Hadoop Distributed File System (HDFS), even though we are using only a single local machine.

Add the following properties to the various Hadoop configuration files, which are available under $HADOOP_PREFIX/etc/hadoop/: core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml

Update the Java and Hadoop paths in the Hadoop environment file

boss@solaiv[]# cd $HADOOP_PREFIX/etc/hadoop

boss@solaiv[]# vi hadoop-env.sh

Paste the following lines at the beginning of the file:

export JAVA_HOME=/opt/jdk1.8.0_05
export HADOOP_PREFIX=/opt/hadoop-2.7.0

Modify the core-site.xml

boss@solaiv[]# cd $HADOOP_PREFIX/etc/hadoop
boss@solaiv[]# vi core-site.xml


Paste following between <configuration> tags

<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://localhost:9000</value>

</property>

</configuration>
Modify the hdfs-site.xml

boss@solaiv[]# vi hdfs-site.xml

Paste following between <configuration> tags

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>
</configuration>

YARN configuration - Single Node modify the mapred-site.xml

boss@solaiv[]# cp mapred-site.xml.template mapred-site.xml

boss@solaiv[]# vi mapred-site.xml

Paste following between <configuration> tags

<configuration>

<property>

<name>mapreduce.framework.name</name>
<value>yarn</value>

</property>

</configuration>

Modify the yarn-site.xml

boss@solaiv[]# vi yarn-site.xml

Paste following between <configuration> tags

<configuration>

<property><name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value></property>

</configuration>

Formatting the HDFS file-system via the NameNode

The first step to starting up your Hadoop installation is formatting the Hadoop file system, which is implemented on top of the local file system of our "cluster" (which includes only our local machine). We need to do this the first time we set up a Hadoop cluster. Do not format a running Hadoop file system, as you will lose all the data currently in the cluster (in HDFS).

root@solaiv[]# cd $HADOOP_PREFIX
root@solaiv[]# bin/hadoop namenode -format

Start NameNode daemon and DataNode daemon: (port 50070)

root@solaiv[]# sbin/start-dfs.sh

To know the running daemons, just type jps or /opt/jdk1.8.0_05/bin/jps

Start ResourceManager daemon and NodeManager daemon: (port 8088)

root@solaiv[]# sbin/start-yarn.sh
To stop the running processes:

root@solaiv[]# sbin/stop-dfs.sh

To stop the ResourceManager daemon and NodeManager daemon:

root@solaiv[]# sbin/stop-yarn.sh

Result :

Thus the procedure to set up the one node Hadoop cluster was completed successfully and the output verified.
Ex. No:6 Mount the one node Hadoop cluster using FUSE

Introduction
FUSE (Filesystem in Userspace) enables you to write a normal user application as a bridge for a
traditional filesystem interface.
The hadoop-hdfs-fuse package enables you to use your HDFS cluster as if it were a traditional
filesystem on Linux. It is assumed that you have a working HDFS cluster and know the
hostname and port that your NameNode exposes.
Aim :

To Mount the one node Hadoop cluster using FUSE.


Procedure:

1. To install fuse-dfs on Ubuntu systems:

sudo apt-get install hadoop-hdfs-fuse

2. To set up and test your mount point:

mkdir -p <mount_point>
hadoop-fuse-dfs dfs://<name_node_hostname>:<namenode_port> <mount_point>
You can now run operations as if they are on your mount point. Press Ctrl+C to end the fuse-dfs
program, and umount the partition if it is still mounted.
Note:

 To find its configuration directory, hadoop-fuse-dfs uses the HADOOP_CONF_DIR configured at the time the mount command is invoked.
 If you are using SLES 11 with the Oracle JDK 6u26 package, hadoop-fuse-dfs may exit immediately because ld.so can't find libjvm.so. To work around this issue, add /usr/java/latest/jre/lib/amd64/server to the LD_LIBRARY_PATH.
3. To clean up your test:

$ umount <mount_point>
You can now add a permanent HDFS mount which persists through reboots.

4. To add a system mount:

1. Open /etc/fstab and add lines to the bottom similar to these:

hadoop-fuse-dfs#dfs://<name_node_hostname>:<namenode_port> <mount_point> fuse allow_other,usetrash,rw 2 0
For example:

hadoop-fuse-dfs#dfs://localhost:8020 /mnt/hdfs fuse allow_other,usetrash,rw 2 0

2. Test to make sure everything is working properly:

$ mount <mount_point>
Your system is now configured to allow you to use the ls command and use that mount point as
if it were a normal system disk.
Result :
Thus the one node Hadoop cluster was mounted using FUSE successfully and the output verified.

Ex. No: 7 Program to use the API’s of Hadoop to interact with it.
Aim :

To write a program for the electrical consumption of all the large scale industries of a particular state, using the APIs of Hadoop to interact with it.

Procedure:
1. Given below is the data regarding the electrical consumption of an organization. It
contains the monthly electrical consumption and the annual average for various years.
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Avg

1979 23 23 2 43 24 25 26 26 26 26 25 26 25

1980 26 27 28 28 28 30 31 31 31 30 30 30 29

1981 31 32 32 32 33 34 35 36 36 34 34 34 34

1984 39 38 39 39 39 41 42 43 40 39 38 38 40

1985 38 39 39 39 39 41 41 41 00 40 39 39 45

If the above data is given as input, we have to write applications to process it and produce results
such as finding the year of maximum usage, year of minimum usage, and so on. This is a
walkover for the programmers with finite number of records. They will simply write the logic to
produce the required output, and pass the data to the application written.

But, think of the data representing the electrical consumption of all the large scale industries of a
particular state, since its formation.

When we write applications to process such bulk data,

 They will take a lot of time to execute.

 There will be a heavy network traffic when we move data from source to network server
and so on.

To solve these problems, we have the MapReduce framework.

2. The above data is saved as sample.txt and given as input. The input file looks as shown below.

1979 23 23 2 43 24 25 26 26 26 26 25 26 25

1980 26 27 28 28 28 30 31 31 31 30 30 30 29
1981 31 32 32 32 33 34 35 36 36 34 34 34 34

1984 39 38 39 39 39 41 42 43 40 39 38 38 40

1985 38 39 39 39 39 41 41 41 00 40 39 39 45

3. Write a program to process the sample data using the MapReduce framework and save the program as ProcessUnits.java (a sketch is given below). The compilation and execution of the program is explained afterwards.
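
A minimal sketch of what ProcessUnits.java can contain is given below. Based on the sample output shown later (1981 34, 1984 40, 1985 45), it assumes the job should emit the annual average (the last column) for every year whose average is above 30; the class and method names are illustrative, not the only possible ones.

package hadoop;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ProcessUnits {

    // Mapper: emit (year, annual average), i.e. the first and last columns of each line.
    public static class EleUnitMapper extends Mapper<Object, Text, Text, IntWritable> {
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            String year = tokens.nextToken();
            String last = year;
            while (tokens.hasMoreTokens()) {
                last = tokens.nextToken();   // after the loop this holds the Avg column
            }
            context.write(new Text(year), new IntWritable(Integer.parseInt(last)));
        }
    }

    // Reducer: keep only years whose average consumption exceeds 30 units.
    public static class EleUnitReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            for (IntWritable val : values) {
                if (val.get() > 30) {
                    context.write(key, val);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Job constructor kept for compatibility with hadoop-core-1.2.1 used for compilation.
        Job job = new Job(conf, "electrical consumption");
        job.setJarByClass(ProcessUnits.class);
        job.setMapperClass(EleUnitMapper.class);
        job.setReducerClass(EleUnitReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
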

4. Compilation and Execution of Process Units Program


Let us assume we are in the home directory of a Hadoop user (e.g. /home/hadoop).

Follow the steps given below to compile and execute the above program.

Step 1
The following command is to create a directory to store the compiled java classes.

$ mkdir units

Step 2
Download hadoop-core-1.2.1.jar, which is used to compile and execute the MapReduce program. Visit the following link http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1 to download the jar. Let us assume the downloaded folder is /home/hadoop/.

Step 3
The following commands are used for compiling the ProcessUnits.java program and creating a jar for the program.

$ javac -classpath hadoop-core-1.2.1.jar -d units ProcessUnits.java

$ jar -cvf units.jar -C units/ .

Step 4
The following command is used to create an input directory in HDFS.

$HADOOP_HOME/bin/hadoop fs -mkdir input_dir

Step 5
The following command is used to copy the input file named sample.txt into the input directory of HDFS.

$HADOOP_HOME/bin/hadoop fs -put /home/hadoop/sample.txt input_dir

Step 6
The following command is used to verify the files in the input directory.

$HADOOP_HOME/bin/hadoop fs -ls input_dir/

Step 7
The following command is used to run the Eleunit_max application by taking the input files from
the input directory.

$HADOOP_HOME/bin/hadoop jar units.jar hadoop.ProcessUnits input_dir output_dir

Wait for a while until the file is executed. After execution, as shown below, the output will
contain the number of input splits, the number of Map tasks, the number of reducer tasks, etc.

INFO mapreduce.Job: Job job_1414748220717_0002 completed successfully

14/10/31 06:02:52 INFO mapreduce.Job: Counters: 49

File System Counters

FILE: Number of bytes read=61

FILE: Number of bytes written=279400

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=546

HDFS: Number of bytes written=40


HDFS: Number of read operations=9

HDFS: Number of large read operations=0

HDFS: Number of write operations=2

Job Counters

Launched map tasks=2

Launched reduce tasks=1

Data-local map tasks=2

Total time spent by all maps in occupied slots (ms)=146137

Total time spent by all reduces in occupied slots (ms)=441

Total time spent by all map tasks (ms)=14613

Total time spent by all reduce tasks (ms)=44120

Total vcore-seconds taken by all map tasks=146137

Total vcore-seconds taken by all reduce tasks=44120

Total megabyte-seconds taken by all map tasks=149644288

Total megabyte-seconds taken by all reduce tasks=45178880

Map-Reduce Framework

Map input records=5

Map output records=5

Map output bytes=45

Map output materialized bytes=67

Input split bytes=208

Combine input records=5

Combine output records=5


Reduce input groups=5

Reduce shuffle bytes=6

Reduce input records=5

Reduce output records=5

Spilled Records=10

Shuffled Maps =2

Failed Shuffles=0

Merged Map outputs=2

GC time elapsed (ms)=948

CPU time spent (ms)=5160

Physical memory (bytes) snapshot=47749120

Virtual memory (bytes) snapshot=2899349504

Total committed heap usage (bytes)=277684224

File Output Format Counters

Bytes Written=40

Step 8
The following command is used to verify the resultant files in the output folder.

$HADOOP_HOME/bin/hadoop fs -ls output_dir/

Step 9
The following command is used to see the output in Part-00000 file. This file is generated by
HDFS.

$HADOOP_HOME/bin/hadoop fs -cat output_dir/part-00000

Below is the output generated by the MapReduce program.


1981 34

1984 40

1985 45

Step 10
The following command is used to copy the output folder from HDFS to the local file system for
analyzing.

$HADOOP_HOME/bin/hadoop fs -get output_dir /home/hadoop

Important Commands
All Hadoop commands are invoked by the $HADOOP_HOME/bin/hadoop command. Running the Hadoop script without any arguments prints the description for all commands.

Usage : hadoop [--config confdir] COMMAND

5. Interact with MapReduce Jobs


Usage: hadoop job [GENERIC_OPTIONS]

The following are the Generic Options available in a Hadoop job.

GENERIC_OPTIONS                                   Description

-submit <job-file>                                Submits the job.

-status <job-id>                                  Prints the map and reduce completion percentage and all job counters.

-counter <job-id> <group-name> <counter-name>     Prints the counter value.

-kill <job-id>                                    Kills the job.

-events <job-id> <from-event-#> <#-of-events>     Prints the events' details received by the jobtracker for the given range.

-history [all] <jobOutputDir>                     Prints job details, failed and killed tip details. More details about the job, such as successful tasks and task attempts made for each task, can be viewed by specifying the [all] option.

-list [all]                                       Displays all jobs. -list displays only jobs which are yet to complete.

-kill-task <task-id>                              Kills the task. Killed tasks are NOT counted against failed attempts.

-fail-task <task-id>                              Fails the task. Failed tasks are counted against failed attempts.

-set-priority <job-id> <priority>                 Changes the priority of the job. Allowed priority values are VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.

To see the status of job


$ $HADOOP_HOME/bin/hadoop job -status <JOB-ID>

e.g.

$ $HADOOP_HOME/bin/hadoop job -status job_201310191043_0004

To see the history of job output-dir


$ $HADOOP_HOME/bin/hadoop job -history <DIR-NAME>

e.g.

$ $HADOOP_HOME/bin/hadoop job -history /user/expert/output

To kill the job


$ $HADOOP_HOME/bin/hadoop job -kill <JOB-ID>

e.g.

$ $HADOOP_HOME/bin/hadoop job -kill job_201310191043_0004

Output

1981 34
1984 40
1985 45
Result:
Thus the program to use the APIs of Hadoop to interact with it was completed successfully and the output verified.

Ex. No: 8 Word count program to demonstrate the use of Map and Reduce tasks

Aim :
To Write a wordcount program to demonstrate the use of Map and Reduce tasks

Procedure:
WordCount is a simple application that counts the number of occurrences of each word in a given
input set.

1. Write the source code in Java which includes the word count logic (a sketch is given below).
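
A minimal version of WordCount.java is sketched below; it closely follows the standard example from the Hadoop MapReduce tutorial.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emit (word, 1) for every token in the input line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sum the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
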

2. Assuming environment variables are set as follows:

export JAVA_HOME=/usr/java/default
export PATH=${JAVA_HOME}/bin:${PATH}
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar

3. Compile WordCount.java and create a jar:

$ bin/hadoop com.sun.tools.javac.Main WordCount.java


$ jar cf wc.jar WordCount*.class

 Assuming that:
 /user/joe/wordcount/input - input directory in HDFS
 /user/joe/wordcount/output - output directory in HDFS

4. Sample text-files as input:


$ bin/hadoop fs -ls /user/joe/wordcount/input/
/user/joe/wordcount/input/file01
/user/joe/wordcount/input/file02
$ bin/hadoop fs -cat /user/joe/wordcount/input/file01
Hello World Bye World

$ bin/hadoop fs -cat /user/joe/wordcount/input/file02


Hello Hadoop Goodbye Hadoop

5. Run the application:

$ bin/hadoop jar wc.jar WordCount /user/joe/wordcount/input /user/joe/wordcount/output


Output:

$ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000


Bye 1
Goodbye 1
Hadoop 2
Hello 2

World 2
Result :
Thus the word count program to demonstrate the use of Map and Reduce tasks was completed successfully and the output verified.
