In a nutshell, BigInsights 4.0 offers:
1) 100% Open Source Base Components
IBM Open Platform with Apache Hadoop is composed of 100 percent open source components for use in big data analysis. This product offering includes open source components such as Ambari, Hadoop, YARN, Spark, Hive, HBase, Knox, Avro, Flume, Pig, Slider, Sqoop, ZooKeeper, Oozie, Nagios, and more. Yes, Spark is now supported!
IBM Open Platform has incorporated the most recent releases across all of the components of Enterprise Hadoop, including Hadoop 2.6.
2) More Flexible Packaging & Monitoring
With support for Apache Ambari, users have a very flexible and intuitive mechanism to install, configure, and monitor the services. The packaging also caters to specific profiles of users, including:
- Business Analyst contains Big SQL, which provides ANSI SQL access to data via JDBC or ODBC, seamlessly, whether that data resides in Hadoop or in a relational database. This means that developers familiar with the SQL programming language can access data in Hadoop without having to learn new languages or skills. It also includes BigSheets, IBM’s intuitive spreadsheet and visualization tool for finding data quickly and easily.
- BigInsights Data Scientist contains the analyst module capabilities as well as the following:
  - Web-based Text Analytics tooling that allows users to extract data from unstructured text without writing any code.
  - New machine-learning engine (SystemML) that automatically tunes its performance based on the size of the data.
  - Support for open source R statistical computing helping clients leverage their existing R algorithms and run them across data stored in a Hadoop cluster.
- BigInsights Enterprise Management introduces new management tools for clients to realize faster time to results. Designed to help allocate resources and optimize workflows, these tools will allow deployments that can scale to large numbers of users and clusters, and will help satisfy high workload demand.
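As a rough sketch of the JDBC access described for the Business Analyst profile, the snippet below assembles a connection URL and an ordinary SQL query. It assumes Big SQL is reached through the IBM Data Server JDBC driver and its jdbc:db2:// URL form. The host, port, database name, and the sales.orders table are hypothetical placeholders, so check the actual values for your own cluster.

```python
# Sketch only: host, port, database name, and table are hypothetical placeholders.


def bigsql_jdbc_url(host, port, database="bigsql"):
    """Build a JDBC URL in the IBM Data Server driver's jdbc:db2:// form.

    The default database name is an assumption, not a documented value.
    """
    return f"jdbc:db2://{host}:{port}/{database}"


# Plain ANSI SQL: per the description above, the same kind of statement is meant
# to work whether the table lives in Hadoop or in a relational database.
TOP_REGIONS_SQL = (
    "SELECT region, COUNT(*) AS order_count "
    "FROM sales.orders "
    "GROUP BY region "
    "ORDER BY order_count DESC"
)

if __name__ == "__main__":
    # Replace host and port with the values configured for your Big SQL server.
    print(bigsql_jdbc_url("bigsql-host.example.com", 12345))
    print(TOP_REGIONS_SQL)
```

The resulting URL and query can then be handed to any JDBC-capable client or SQL tool.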
For a very quick tour and test run of IBM BigInsights, you can now download the 'Quick Start Edition' (QSE). It offers a free-to-use VM with the product pre-installed and pre-configured for non-production use and evaluation.
Some quick reference points on the QSE:
1) Quick Start Edition Download page: download link
2) Make sure to download both the VM image and the README .html file, which has the instructions to boot and log in to the VM.
3) The VM desktop would appear as below:
4) The Ambari UI launches automatically in the browser, and you will see each service start in sequence.
5) With Ambari, you can easily add any of the additional services. For instance, Spark is not pre-configured at launch. Use the 'Add Service' option to add the optional services or components you want.
Ambari then shows a screen prompting you to select the additional services or components to add.
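The service status and 'Add Service' steps in the Ambari UI also have counterparts in Ambari's REST API. The sketch below is a minimal illustration, not the QSE's documented procedure. It builds (without sending) a request that lists service states and a request that registers a new service, and it picks out services that are not yet started from a response payload. The URL, cluster name, and admin/admin credentials are assumptions for illustration. Registering a service is only the first of several calls that the 'Add Service' wizard performs for you.

```python
import base64
import json
import urllib.request

AMBARI_URL = "http://localhost:8080"  # assumption: Ambari's usual default port
CLUSTER = "mycluster"                 # hypothetical cluster name


def ambari_request(path, method="GET", user="admin", password="admin"):
    """Build (but do not send) an authenticated Ambari REST request.

    The admin/admin credentials are an assumption; use the ones from the README.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    url = f"{AMBARI_URL}/api/v1/clusters/{CLUSTER}/{path}"
    req = urllib.request.Request(url, method=method)
    req.add_header("Authorization", f"Basic {token}")
    # Ambari requires this header on requests that modify state.
    req.add_header("X-Requested-By", "ambari")
    return req


def service_states_request():
    """Request listing every service together with its current state."""
    return ambari_request("services?fields=ServiceInfo/state")


def add_service_request(service_name):
    """Request that registers a service (for example SPARK) with the cluster."""
    req = ambari_request("services", method="POST")
    req.data = json.dumps({"ServiceInfo": {"service_name": service_name}}).encode()
    return req


def services_not_started(payload):
    """Return the names of services whose state is anything other than STARTED."""
    return [
        item["ServiceInfo"]["service_name"]
        for item in payload["items"]
        if item["ServiceInfo"]["state"] != "STARTED"
    ]


if __name__ == "__main__":
    # Illustrative payload in the shape Ambari returns; not real cluster output.
    sample = {
        "items": [
            {"ServiceInfo": {"service_name": "HDFS", "state": "STARTED"}},
            {"ServiceInfo": {"service_name": "HIVE", "state": "INSTALLED"}},
        ]
    }
    print(services_not_started(sample))
    print(add_service_request("SPARK").full_url)
```

To run this against a live VM, pass either request to urllib.request.urlopen and decode the JSON body of the response.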
Please leave your comments and suggestions on this blog or through the community forums at Hadoop Dev.