The yearly MusicBrainz summit serves an important function in building our community: we talk about issues facing MusicBrainz and we plan the road map for MusicBrainz projects. The summits are usually scheduled to allow as many people to attend as possible and this year we chose Nürnberg, Germany as our location. MusicBrainz contributor Nikolai "Pronik" Prokoschenko lives in Nürnberg, served as our local contact, and ended up planning most of the summit.
Pronik found us a conference room that we rented for the entire day, complete with open WiFi, which is important if you plan to have a room full of geeks. He also found us a cheap Gasthof that provided lodgings slightly better than a hostel for a mere 20€ per person per night, a really good deal for Europe. The evening before the summit we all sat in the Gasthof and were treated to some confusing German/Greek cuisine with some of the rudest service any of us had ever encountered. But our group is used to dealing with the crude Internet public, so we managed to laugh off the horrible service and still have a great time.
Luckily, there was a grocery store right next door to our Gasthof, and we commenced another successful crowdsourced breakfast. Four people were each given 20€ with instructions to buy the food and drinks they would like for breakfast and lunch. No collusion was allowed between people! Once the shopping was complete we walked to the conference room, settled in and dove into the masses of food we'd collected. Many tasty bread rolls with jam, Nutella, cold cuts and cheese were consumed. Of course we had fun things like a case of Bionade, juices, tea, gummy bears and chocolate. Crowdsourcing breakfast takes a potentially frustrating chore and makes it fun for everyone.
Plus, Pronik and his mate Kira brought a MusicBrainz decorated cake to celebrate 10 years of MusicBrainz!
As people were eating, we started to collect an unconference-like agenda of what people wanted to talk about. We decided to have a detailed state of the project talk including recent developments from meeting our customers in Europe. We also talked about current development processes and some of the problems associated with these processes. Oliver Charles, a 2008 Google Summer of Code™ student, gave an introduction on how to hack on the MusicBrainz server, based on his work from the last year.
Most of the time was spent discussing new features to add once we release our much-anticipated Next Generation Schema. At times we managed to get into deep philosophical discussions about what MusicBrainz is and what it should be. At other times we discussed lighthearted topics with lots of joking. These summits do wonders for building our community and getting people on the same page. We manage to explore many topics and reach consensus on many points in one day instead of spending weeks on the same discussions online.
Finally, in the evening we cleaned up our space and retired to a local beer hall where we continued the discussion in a less formal manner. If you're interested, we posted all the session notes from the summit on our wiki. All in all, this event was fun and not much effort to put on — thanks to Pronik! On another happy note, 1/3 of the people in attendance were women, which is much better than most tech summits I've attended.
In total we spent about $1500, including all the food, drinks, lodgings and one person's travel costs. For a summit with 12 people, I think we did rather well! I call that Google's support well spent — thanks again for supporting MusicBrainz, Google!
London Open Source Jam 15
Friday, December 18, 2009
On the 3rd of December we held the latest (and greatest) Google London Open Source Jam at our offices near Victoria. The Jam is a way to get like-minded Open Source contributors and users together and give them a chance to give a 5 minute talk on something dear to their hearts, all the while availing themselves of free beer and pizza!
This time's topic was the somewhat catchall "the Web." As always, the topic is more of a guide than a rule, so we had some pretty diverse talks.
Our very own Jon Skeet set the evening off to a good start by telling us all about Noda Time — a new Open Source library for handling dates and times in .NET, based on the Joda Time library for Java.
Simon Phillips is a consultant to the film business and gave a great presentation on how he uses Google Wave to help him work closely with directors, script writers, set designers and the like. He showed some great ideas for using Wave in this way and was canvassing for help in developing Open Source Wave robots to help this process.
Simon Stewart gave a rallying cry for making the web more accessible to the blind and deaf, especially in this modern era of HTML canvas and video tags. By ensuring your sites are accessible, you open them up to more users, and as a useful side effect you also make them more testable.
HTTP has started to show its age, and maybe it's time for a leaner, meaner protocol to come along. I took a brief break from my hosting duties to present a summary of SPDY, a chromium.org project to develop a replacement protocol which will deliver data to our browsers faster.
If you run a web site, you may have come to fear the "Slashdot effect" where you are linked from a popular website and get a spike of traffic. Glyn Wintle from the Open Rights Group (ORG) informed us that this is nothing compared to having a bunch of knitting forums link to you! His was a tale of Open Sourcing of knitting patterns and DMCA take-down notices. He also brought us up to speed on the latest from the ORG.
Sam Mbale gave us an update on his work bringing open source to Africa and told us all about BarCamp Lusaka which he'll be attending. We look forward to hearing how it went at another Jam.
Robert Rees gave us an experience report on using Velocity templates to divide responsibilities between engineers and web designers. It seems to work pretty well; contracts are enforced by unit tests, and designers know exactly what primitives they can use when laying out web pages.
Finally, Matt Savage talked about his ideas for RESTful acceptance tests, and Steven Goodwin gave us an update on his project to build a "Wallace and Gromit" house.
You can find more pictures of the event on Picasa Web Albums. To find out more about the Google London Open Source Jam, visit http://osjam.appspot.com/. If you'd like to receive regular updates about future jams, sign up for our mailing list. We hope to see you at future jams!
By Matt Godbolt, Software Engineering Team
Rocking the Grid: The Globus Alliance's Second Google Summer of Code
Tuesday, December 15, 2009
The Globus Alliance is a community of organizations and individuals developing fundamental technologies behind the "Grid," which lets people share computing power, databases, instruments, and other on-line tools securely across corporate, institutional, and geographic boundaries without sacrificing local autonomy. We first participated in Google Summer of Code™ in 2008 and we found the experience extremely productive both for the Globus Alliance and the individual mentors, so we wanted to confirm the value of the program for the students who took part. We contacted our eight students from last year to find out what impact Google Summer of Code had on their lives and careers. While many of our students still remembered the experience fondly, and said it was valued highly by prospective employers, there were two students who had particularly remarkable stories.
AliEn Grid Site Dynamic Deployment and Working at CERN
Last year, Artem Harutyunyan, mentored by Tim Freeman, developed a set of scripts on top of Globus Nimbus to dynamically deploy an entire AliEn Grid site (AliEn is the Grid infrastructure used by scientists participating in the ALICE experiment at CERN). His collaboration with the CERN and Globus Nimbus folks went beyond his Google Summer of Code work, and resulted in a new framework, called CernVM Co-Pilot, for execution of 'pilot' Grid jobs on cloud resources. His work is currently used in production to run Grid jobs from CERN's ALICE experiment, and there are plans to extend it for the execution of ATLAS and LHCb jobs. Artem also co-authored two papers on his work: "Dynamic AliEn Grid Sites on Nimbus with CernVM" was presented at the 17th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2009) in Prague, and "Building a Volunteer Cloud", which includes a description of CernVM Co-Pilot, was presented during the Latin American Conference on High Performance Computing in Mérida, Venezuela.
Holder-of-Key Single Sign-On
Joana M. F. Trindade, mentored by Tom Scavo, spent last summer implementing a Holder-of-Key Single Sign-On profile handler for the Shibboleth Identity Provider in Globus GridShib. And, since then, things have just been getting better for her. Thanks to her outstanding summer work, she was offered an appointment as a Visiting Scholar at UIUC, where she worked on researching fault injection in virtual machines with Professor Ravi Iyer. After six months in that position, Joana was offered admission into the master's program at UIUC, where she is currently working with Professor Marianne Winslett. More importantly, Joana tells us that participating in Google Summer of Code gave her a renewed sense of confidence in her research abilities, having previously thought that her academic background was insufficient to gain admission into a top-tier university in the US. Joana tells us that "After Google Summer of Code, I regained that hope, and I must say I'm really happy to have found a topic in Globus to which I could contribute, and that in turn opened so many doors for me."
Congratulations to Artem and Joana on all you have achieved!
Lessons Learned
Our first Google Summer of Code last year also had its fair share of challenges, including two students who didn't make it through the program, but it gave us the opportunity to learn a lot about how to mentor and manage summer students. We were fortunate to be selected again this year as a Google Summer of Code mentoring organization, which allowed us to apply everything we learned. First of all, we required students to provide more information about their background and the project they were proposing. Last year our student application form was essentially a blank form saying "Tell us about your project here," so this year we presented prospective students with more specific questions. We also decided to check in with our students more often which, at least in one case, allowed us to identify a problem between a student and a mentor early on, giving us time to deal with it constructively before the midterm.
In the end, applying what we learned during last year's Google Summer of Code, as well as at the Mentor Summit, had a noticeable effect. We were fortunate to be given ten students to mentor, and all ten students passed. Furthermore, our mentors report that practically all the code written by the students has either already been released or will be released soon. In fact, overall, we felt that this year's students rocked. Here's a summary of their summer work.
Going Beyond a Single Cluster
The Globus Nimbus cloud toolkit allows you to turn your cluster into an Infrastructure-as-a-Service (IaaS) cloud. However, it was mainly geared towards managing a single cluster. Not any more! Adam Bishop, mentored by Ian Gable, worked hard over the summer to add new components enabling multiple cluster support for Nimbus. He developed a series of production-quality plugins, which have already been committed to the Nimbus source repository, that publish the state of a Nimbus cluster back to a Globus MDS Registry. This allows the availability of cloud resources across multiple Nimbus clusters to be gathered together into a single registry, which is the first step towards adding cross-cluster support to Nimbus.
Spilling Over Multiple Clusters
Another student, Jan-Philip Gehrcke, mentored by Kate Keahey, also spent the summer with his head in the clouds, but in a good way: he developed the Clobi project, a job scheduling system supporting virtual machines (VMs) in multiple IaaS clouds, with support for Globus Nimbus and Amazon EC2 clouds. In a nutshell, there are many scientific applications that are typically run as "jobs" on a compute cluster. Jan-Philip's project allows these jobs to be submitted to a cloud instead of to a traditional compute cluster. The most interesting use case is when a site operates a Globus Nimbus cloud and, during peaks in demand for computational capacity, extends its capacity momentarily by spilling the jobs over to a second (or third, or fourth, ...) cloud such as Amazon EC2. Although Clobi is not tied to any particular application (its design is generic and should be useful whenever it’s convenient to distribute jobs across different clouds), the motivating application for Clobi is ATLAS Computing (for the LHC's ATLAS experiment at CERN). In fact, by the end of the summer, Jan-Philip was able to run a common ATLAS Computing application (the so-called “full chain”) successfully with Clobi. If you want more details about Clobi, check out this blog post written by Jan-Philip.
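To make the spill-over idea a bit more concrete, here is a minimal sketch in Python. It is not Clobi's code or design: the `Cloud` class, the `schedule` function, the slot counts and the cloud names are all hypothetical, and the sketch simply assumes each cloud offers a fixed number of VM slots and that clouds are tried in order of preference.

```python
from dataclasses import dataclass, field


@dataclass
class Cloud:
    """Hypothetical stand-in for an IaaS cloud with a fixed number of VM slots."""
    name: str
    capacity: int
    jobs: list = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.jobs) < self.capacity


def schedule(jobs, clouds):
    """Place each job on the first cloud (in preference order) that has a free slot.

    Returns a dict mapping job name -> cloud name.
    """
    placement = {}
    for job in jobs:
        target = next((cloud for cloud in clouds if cloud.has_room()), None)
        if target is None:
            raise RuntimeError(f"no capacity left for job {job!r}")
        target.jobs.append(job)
        placement[job] = target.name
    return placement


if __name__ == "__main__":
    # The local cloud fills up first; the remaining jobs spill over to the second cloud.
    clouds = [Cloud("local-nimbus", capacity=2), Cloud("ec2", capacity=3)]
    print(schedule(["job1", "job2", "job3", "job4"], clouds))
```

A real scheduler would also have to start and stop VMs, track job state and handle failures; the sketch only shows the ordering decision.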
Incremental GridFTP Transfers
Enough about clouds, let's move on to the exciting topic of data. Globus GridFTP is a high-performance, secure, reliable data transfer protocol that is pretty good at moving data. Fast. Of course, there's always someone who wants to go even faster, like Shruti Jain, mentored by Michael Link. Shruti took globus-url-copy, the GridFTP client, and added a 'sync' feature that allows a local and remote file to be synchronized, by sending only the changed sections of the file. This results in more effective bandwidth utilization by avoiding redundant data transfers.
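One common way to send only the changed sections of a file is to compare per-block digests. The sketch below illustrates that general technique in Python; it is an assumption for illustration, not a description of how the 'sync' feature in globus-url-copy actually detects changes. The function names and the tiny block size are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(local: bytes, remote_digests: list, block_size: int = BLOCK_SIZE) -> dict:
    """Map block index -> local block content for every block that differs remotely."""
    delta = {}
    for index, digest in enumerate(block_digests(local, block_size)):
        if index >= len(remote_digests) or remote_digests[index] != digest:
            start = index * block_size
            delta[index] = local[start:start + block_size]
    return delta


def apply_delta(remote: bytes, delta: dict, new_length: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the current file on the remote side from its old copy plus the delta."""
    blocks = [remote[i:i + block_size] for i in range(0, len(remote), block_size)]
    for index, content in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = content   # overwrite a block that changed
        else:
            blocks.append(content)    # the file grew: append the new block
    return b"".join(blocks)[:new_length]


if __name__ == "__main__":
    local = b"AAAABBBBCCCCDD"
    remote = b"AAAAXXXXCCCC"
    delta = changed_blocks(local, block_digests(remote))
    print(sorted(delta))  # indices of the blocks that need to be sent
    print(apply_delta(remote, delta, len(local)) == local)
```

Only the digests and the differing blocks cross the network, which is where the bandwidth saving comes from. Fixed-size blocks do not cope well with insertions in the middle of a file; rolling-checksum schemes handle that case but are beyond this sketch.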
Checksummed GridFTP Transfers
Remember Mattias Lidman? We certainly do. In last year's Google Summer of Code, he developed a compression driver for the Globus XIO input/output library (which GridFTP depends on) to compress/uncompress data as it passes through it. However, although moving data faster is all well and good, it's not worth much if the data somehow gets corrupted in flight. So this year, Mattias, mentored by Joseph Bester, continued to work on Globus XIO and developed a Checksum Driver. Mattias's driver checksums GridFTP data streams, allowing both ends of a GridFTP transfer to verify the integrity of the data.
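The general idea of checksumming a data stream can be sketched in a few lines of Python. This is an illustration only, not the Globus XIO driver: the function names are hypothetical, and SHA-256 is an assumption, since the post does not say which checksum algorithm the driver uses.

```python
import hashlib
import hmac
from typing import Iterable


def stream_checksum(chunks: Iterable[bytes]) -> str:
    """Compute a SHA-256 digest incrementally over a stream of chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(received_chunks: Iterable[bytes], expected_checksum: str) -> bool:
    """Return True if the received stream hashes to the sender's checksum."""
    return hmac.compare_digest(stream_checksum(received_chunks), expected_checksum)


if __name__ == "__main__":
    sent = [b"grid", b"ftp"]
    checksum = stream_checksum(sent)
    # Chunk boundaries may differ on the receiving end; only the byte sequence matters.
    print(verify_transfer([b"gr", b"idftp"], checksum))
    print(verify_transfer([b"grid", b"ftq"], checksum))
```

Because the digest is updated chunk by chunk, neither end needs to hold the whole file in memory to verify it.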
CQL Queries Builder
You know one really cool thing grids are used for? Cancer research. The Cancer Biomedical Informatics Grid, or caBIG®, is an information network enabling all constituencies in the cancer community – researchers, physicians, and patients – to share data and knowledge. caGrid is the underlying service-oriented infrastructure that supports caBIG, and it relies heavily on the Globus Toolkit. Some of the data services in this architecture use a query language called CQL that is, well... complicated. To make life easier for scientists, Monika Machunik, mentored by Wei Tan, wrote a plug-in for Taverna (an open source tool used by scientists to design and execute workflows) for constructing CQL queries, allowing scientists to focus on their work rather than on the intricacies of the CQL language.
GridWay-Google Maps Mashup
Grids require coordinating resources across multiple organizations, and the Globus GridWay meta-scheduler is a great tool to do just that. However, coordinating hundreds or even thousands of machines across dozens of sites can get a bit messy using the console-based tools included with GridWay. Carlos Martín, mentored by Alejandro Lorca, tackled this problem by creating an interactive GridWay-Google Maps mashup, allowing the administrators and users of a GridWay installation to get a quick snapshot of the status of multiple sites and the jobs running in them, as shown in this screenshot:
Carlos used the Google Web Toolkit to develop this application, which is totally decoupled from GridWay, making it easy to install it alongside existing installations of GridWay. In fact, you can download the GridWay+Google Maps application and check out its documentation, including more screenshots, at the application's page on the GridWay site.
GridWay GUI
Srinivasan Natarajan, mentored by Jose Luis Vazquez-Poletti, worked on a more administration-oriented GUI for GridWay, allowing users to compose, manage and control their jobs instead of using the command line interface. This GUI includes a host of other features, such as host and user monitoring, filtering account statistics and execution history information, and support for processing DAGMan workflows, including visualizing dependencies between jobs in the workflow.
Both of the GridWay projects were presented in several sessions, including one on nuclear fusion, at the EGEE'09 conference in Barcelona, Spain back in September.
GridFTP Benchmarking
How about we get back to the subject of data management? The recent addition of UDT (UDP-based Data Transfer) support to GridFTP has made even faster transfer speeds possible. You guessed it: here's another student who couldn't resist the need for speed this summer. Jamie Schwettmann, mentored by Raj Kettimuthu, sought to characterize the performance of GridFTP over 10Gb/s networks, specifically to measure the speed increase given by UDT as compared to TCP transfers, as well as a number of other considerations such as CPU and memory overhead at both ends of the transfer. In doing so, they decided to develop an automated GridFTP benchmarking and throughput optimization utility called globus-transfer-test. It takes URL pairs from a list or on the command line, and allows for varying input parameters such as parallelism level, transfer type (memory-to-memory, disk-to-disk, etc.), TCP buffer sizes, MTU sizes, and all other standard globus-url-copy options (except multicasting). When possible, it also compares the results with other performance and throughput utilities such as iperf or scp. Designed for general use by users or administrators as well as to carry out our performance characterization, globus-transfer-test aims to provide enough information to optimize GridFTP options for maximizing throughput between grid sites. This common need has allowed collaboration with many other projects and organizations in the course of development and testing, including the US ATLAS Project, TeraGrid, and OSCER. Jamie even presented a poster on her project at the 2009 Oklahoma Supercomputing Symposium.
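A parameter sweep of this kind boils down to trying every combination of the inputs and keeping the one with the best measured throughput. The Python sketch below shows that pattern under loud assumptions: it does not reflect the interface of globus-transfer-test, the `sweep` and `best_config` names and the dictionary keys are hypothetical, and the measurement function is a made-up stand-in for actually running a transfer.

```python
import itertools
from typing import Callable, Iterable, Iterator


def sweep(parallelism_levels: Iterable[int],
          tcp_buffer_sizes: Iterable[int],
          transports: Iterable[str]) -> Iterator[dict]:
    """Yield one configuration per combination of the input parameters."""
    for parallelism, buffer_size, transport in itertools.product(
            parallelism_levels, tcp_buffer_sizes, transports):
        yield {
            "parallelism": parallelism,
            "tcp_buffer_bytes": buffer_size,
            "transport": transport,
        }


def best_config(configs: Iterable[dict], measure: Callable[[dict], float]) -> dict:
    """Return the configuration with the highest measured throughput."""
    return max(configs, key=measure)


if __name__ == "__main__":
    configs = list(sweep([1, 4], [65536, 1048576], ["tcp", "udt"]))

    def fake_measure(config: dict) -> float:
        # Placeholder numbers only; a real run would time an actual transfer.
        bonus = 2 if config["transport"] == "udt" else 1
        return config["parallelism"] * config["tcp_buffer_bytes"] * bonus

    print(len(configs))
    print(best_config(configs, fake_measure))
```

An exhaustive sweep grows multiplicatively with each added parameter, so a real tool would likely need to limit the grid or repeat each measurement to smooth out network noise.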
AJAX Framework for Globus Web Services
Many of the components in Globus are web services, which are not exactly human-readable creatures. Fugang Wang, mentored by Tom Howe, developed a JavaScript API that enables accessing Globus services from a web client using AJAX. Fugang's framework, which includes a backend service that mediates service requests to the Globus Toolkit and an AJAX web client to access these services, makes life easier for Globus developers and users by allowing them to interact with Globus services from the comfort of their web browsers.
Secure Cloud Communications
And we'll end with the ever-popular subject of data management. Melissa Weaver, mentored by John Bresnahan, developed a PSK driver for Globus XIO. She first developed a program that uses the OpenSSL libraries to encrypt and decrypt data with a stream or block cipher of the user's choice. This allowed her to experiment with different lengths of keys and initialization vectors and different file sizes to make performance measurements. Then, she developed the XIO PSK driver itself, which used the results of the first program to implement an RC2 block cipher to ensure that any communication between computers, once a connection has been set up, is secure.
High energy physics experiments at CERN! Cancer research! Nuclear fusion! Cloud computing! Fast data transfers! Oh my! Oodles of congratulations to our mentors and students for all their hard work and for making this such an awesome Google Summer of Code for the Globus Alliance!
By Borja Sotomayor, Ph.D. Candidate, University of Chicago and Google Summer of Code Organization Administrator