Sunday, May 31, 2015

What is the big deal with BigInsights 4.0?

BigInsights offers a one-stop shop for business analysts and data scientists. IBM BigInsights enhances open source Hadoop with a complete set of capabilities to query, visualize, and explore data and to run distributed machine learning at scale, resulting in deeper insight and better actions.

In a nutshell, BigInsights 4.0 offers:

1) 100% Open Source Base Components
IBM Open Platform with Apache Hadoop is composed of 100 percent open source components for use in big data analysis. This product offering includes open source components such as Ambari, Hadoop, YARN, Spark, Hive, HBase, Knox, Avro, Flume, Pig, Slider, Sqoop, ZooKeeper, Oozie, Nagios, and more. Yes, Spark is now supported!
IBM Open Platform has incorporated the most recent releases across all of the components of Enterprise Hadoop, including Hadoop 2.6.
2) More Flexible Packaging & Monitoring
With support for Apache Ambari, users have a flexible and intuitive mechanism to install, configure, and monitor the services. The packaging also caters to specific user profiles, including:
  • Business Analyst contains Big SQL, which provides ANSI SQL access to data across systems through JDBC or ODBC, whether that data resides in Hadoop or in a relational database. This means that developers familiar with SQL can access data in Hadoop without having to learn new languages or skills. It also includes BigSheets, IBM’s intuitive spreadsheet and visualization tool for finding data quickly and easily.
  • BigInsights Data Scientist contains the analyst module capabilities as well as the following:
    • Web-based Text Analytics tooling that allows users to extract data from unstructured text without writing any code.
    • New machine-learning engine (SystemML) that automatically tunes its performance based on the size of the data.
    • Support for open source R statistical computing, helping clients leverage their existing R algorithms and run them across data stored in a Hadoop cluster.
  • BigInsights Enterprise Management introduces new management tools that help clients realize faster time to results. Designed to help allocate resources and optimize workflows, these tools allow deployments to scale to large numbers of users and clusters and help satisfy high workload demand.
Refer to the IBM BigInsights documentation for more details.

For a quick tour and test run of IBM BigInsights, you can now download the 'Quick Start Edition'. It offers a free VM with the product pre-installed and pre-configured for non-production use and evaluation.

Some quick reference points on QSE:

1) Quick Start Edition Download page: download link

2) Make sure to download both the VM image and the README.html, which has the instructions to boot and log in to the VM.

3) The VM desktop appears as shown below:

It has some useful shortcuts; more details are in the README.

4) The Ambari UI launches automatically in the browser, and you will see each service start in sequence.
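If you prefer to watch the startup from a script rather than the UI, Ambari also exposes a REST API that reports the state of each service. The following is a minimal sketch, not taken from the QSE README: the host, port, cluster name, and credentials are placeholders you would replace with the values for your VM, and the sample payload is hand-written to resemble the shape of an Ambari services response.

```python
import base64
import json
import urllib.request


def services_url(host, cluster, port=8080):
    """Build the Ambari REST URL that lists services with their state.

    host, cluster and port are placeholders; check your own VM for real values.
    """
    return (f"http://{host}:{port}/api/v1/clusters/{cluster}"
            "/services?fields=ServiceInfo/state")


def fetch_services(host, cluster, user, password, port=8080):
    """Fetch the raw JSON body from Ambari using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        services_url(host, cluster, port),
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def service_states(payload):
    """Map each service name to its reported state."""
    document = json.loads(payload)
    return {
        item["ServiceInfo"]["service_name"]: item["ServiceInfo"]["state"]
        for item in document.get("items", [])
    }


def not_started(states):
    """Return the names of services that are not yet in the STARTED state."""
    return sorted(name for name, state in states.items() if state != "STARTED")


# Hand-written sample for illustration only, not captured from a real cluster.
SAMPLE = """
{"items": [
  {"ServiceInfo": {"service_name": "HDFS", "state": "STARTED"}},
  {"ServiceInfo": {"service_name": "YARN", "state": "STARTED"}},
  {"ServiceInfo": {"service_name": "HIVE", "state": "INSTALLED"}}
]}
"""

if __name__ == "__main__":
    print(not_started(service_states(SAMPLE)))
```

Against a running VM you would pass the output of fetch_services(...) to service_states instead of SAMPLE, and poll until not_started returns an empty list.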

5) With Ambari, you can easily add any of the additional services. For instance, you will notice that Spark is not pre-configured at launch. Use the 'Add Service' option to add the optional services or components you want.

It shows a screen prompting you to select the additional services or components to add.

Please share your comments and suggestions on this blog or through the community forums @ Hadoop Dev.
