TalendOpenStudio BigData ReleaseNotes 5.4.0 en


Talend Open Studio for Big Data

Release Notes

5.4.0

Publication date October 28, 2013

Copyleft
This documentation is provided under the terms of the Creative Commons Public License (CCPL). For more information about what you can and cannot do with this documentation in accordance with the CCPL, please read: http://creativecommons.org/licenses/by-nc-sa/2.0/

Notices
All brands, product names, company names, trademarks and service marks are the properties of their respective owners.

Table of Contents
System Requirements
Big Data: New Features
   1. Demo project
   2. Hadoop file formats
   3. Kerberos security
   4. NoSQL databases
   5. Upgraded support for Hadoop distributions
   6. In-memory technology
   7. Cloud technology
   8. File management in HDFS
   9. Other features
Big Data: Bug Fixes / Change Log
   1. Bug Fixes
Big Data: Known Issues
Documentation
   1. Revised documents
   2. Open issues

System Requirements

For installation instructions and system requirements, refer to the Installation and Upgrade Guides on the Talend Help Center (http://help.talend.com).

Big Data: New Features



1. Demo project
1. A Big Data demo project is provided with the Studio. The project includes a number of easy-to-use sample Jobs to help familiarize users with the various features and functions of Talend Studio with Big Data.

2. Hadoop file formats


Support for Sequencefile, RC, ORC and Avro has been added to several components:
1. The tHiveCreateTable and tHiveLoad components have been created. They support not only a wide range of commonly used file formats such as Sequencefile, RC, ORC and Avro, but also formats that are not officially supported by Talend.
2. In addition to their existing functions, tPigLoad and tPigStoreResult can now process a Sequencefile, RC or Avro file.
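To illustrate what choosing a file format means when a Hive table is created, the sketch below builds a HiveQL CREATE TABLE statement with a STORED AS clause. It is an illustration only, not the statement that tHiveCreateTable generates: the helper create_table_ddl, the table name web_logs and the column names are all hypothetical. Avro is left out because the way it is declared depends on the Hive version.

```python
# Hypothetical helper, not Talend's generated code: builds a HiveQL
# CREATE TABLE statement for one of the file formats named above.

# Keyword used in Hive's STORED AS clause for each format name.
STORED_AS = {
    "Sequencefile": "SEQUENCEFILE",
    "RC": "RCFILE",
    "ORC": "ORC",
}


def create_table_ddl(table, columns, file_format):
    """Return a CREATE TABLE statement for `table`.

    `columns` is a list of (name, hive_type) pairs.
    """
    if file_format not in STORED_AS:
        raise ValueError(f"format not covered by this sketch: {file_format}")
    column_list = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return f"CREATE TABLE {table} ({column_list}) STORED AS {STORED_AS[file_format]}"


# Example with made-up table and column names.
ddl = create_table_ddl("web_logs", [("id", "INT"), ("url", "STRING")], "ORC")
print(ddl)
```

Only the STORED AS keyword changes between formats here, which is why a single component setting can switch the storage format of the table.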

3. Kerberos security
1. The Kerberos kinit authentication mode has been enabled for all the Big Data components, including the Hive components.
2. The Kerberos keytab authentication mode has been added to all the Big Data components except the HBase ones.

4. NoSQL databases
1. The following components have been created to enable transactions with their related NoSQL databases:
   - tCassandraBulkLoad, tCassandraOutputBulk, tCassandraBulkExec and tCassandraOutputBulkExec
   - tMongoDBBulkLoad
   - The Riak components
2. The 2.4 and the 2.5 versions of MongoDB are now supported by its related components.

5. Upgraded support for Hadoop distributions


1. New versions of the following Hadoop distributions are supported:

   - Hortonworks Data Platform 1.3 and 2.0
   - Cloudera 4.3 and 4.4
   - MapR 2.1.3 and 3.0.1
2. EMC Pivotal is now available.

6. In-memory technology
1. The newly added SAP Hana components help users easily configure the connection to a SAP Hana system and process transactions with this in-memory computing platform.

7. Cloud technology
1. GS (Google Storage) components are now available for users to prepare their data before transferring the data to Google BigQuery.

8. File management in HDFS


1. The tSqoopMerge component has been created for merging two datasets, with newer records overwriting the older ones.
2. The HDFS components have been upgraded:
   - The tHDFSCopy component can now merge the part files generated at the end of a MapReduce computation.
   - The input and the output components can now handle header rows.
   - The tHDFSInput component can read sub-directories of a specified directory.
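The merge rule described for tSqoopMerge, where a newer record replaces an older one that has the same key, can be sketched as follows. This is a minimal illustration of the rule only, not how Sqoop or the component implements it: the function merge_datasets, the key column id and the sample records are hypothetical.

```python
def merge_datasets(older, newer, key):
    """Merge two lists of record dicts; when keys clash, the newer record wins."""
    merged = {record[key]: record for record in older}
    for record in newer:
        merged[record[key]] = record  # the newer record overwrites the older one
    # Sort by key so that the result is deterministic.
    return sorted(merged.values(), key=lambda record: record[key])


# Made-up sample data: id 2 exists in both datasets.
older = [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Lyon"}]
newer = [{"id": 2, "city": "Nice"}, {"id": 3, "city": "Lille"}]

result = merge_datasets(older, newer, key="id")
print(result)
```

Records that exist in only one dataset are kept unchanged; only records that share a key are replaced.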

9. Other features
1. Support for OAuth2 security has been added to the Salesforce components.
2. With the addition of support for Amazon S3 (Simple Storage Service), users can use dedicated components to perform transactions with this data storage service.
3. The Vertica components now officially support Vertica 5.1 and Vertica 6.0.

Big Data: Bug Fixes / Change Log



1. Bug Fixes
In addition to the new features above, a number of minor improvements across the product and significant bug fixes have been made. See the corresponding Change Log on our bug tracking system for details on the individual issues: https://jira.talendforge.org/secure/ReleaseNote.jspa?projectId=10237&version=14512

Big Data: Known Issues



We encourage you to consult the JIRA bug tracking tool for a full list of open issues: https://jira.talendforge.org/secure/IssueNavigator.jspa?requestId=16481

Documentation

1. Revised documents
In addition to updates to the content across the documentation set, the following specific documentation changes have been made:
- Talend Open Studio for MDM User Guide now includes parts describing how to work with the Integration and Profiling perspectives. This information is the same as the information contained in the Talend Open Studio for Data Integration User Guide and the Talend Open Studio for Data Quality User Guide.
- Talend Big Data Studio Getting Started Guide has been renamed to Talend Big Data Getting Started Guide.
- A new chapter "Getting started with Talend Big Data using the demo project" has been added to the Talend Big Data Studio Getting Started Guide. This chapter provides short descriptions of the sample Jobs included in the demo project and introduces the necessary preparations to run the sample Jobs on a Hadoop platform.
- Talend Open Studio for ESB Mediation Components Reference Guide and Talend ESB Mediation Components Reference Guide have been merged into one guide, Talend ESB Mediation Components Reference Guide.
- In the ESB Getting Started Guide, the chapter "Downloading and installing Talend ESB software" is now called "Getting started with Talend ESB", and the demo chapters are now split into two categories ("Basic deployment and runtime use cases" and "Advanced deployment and runtime use cases with SOA Governance").
- In the ESB Infrastructure Services Configuration Guide and the STS User Guide, some conceptual information has been added that was previously found in the ESB Getting Started Guide.

2. Open issues
We encourage you to consult the JIRA bug tracking tool for a full list of open issues: https://jira.talendforge.org/secure/IssueNavigator.jspa?requestId=16490
