
The answer to the big data challenge

Building a scalable, high-performance data cluster to serve big data analytics

Big data analytics projects inevitably begin with big hopes and grand plans. Getting started with Hadoop and Spark is straightforward. Pilot projects start with open source tools, sample data and a modest goal. Pilot success might be a single view of previously independent data that allows end-to-end reporting of a customer or a process. And then the fun begins. Real data. Regular reports. Scaling the cluster.

The weakest link and the easiest issue to address is storage. The default open source data storage, Hadoop Distributed File System (HDFS), was not designed for the enterprise. For example, the data organizations want to analyze almost always comes from other sources. It may contain client information that needs to be secured and access-controlled. Inevitably, other applications or users also want to use the same data as the big data cluster, through industry-standard file or object interfaces.

The solution is to build a scalable, high-performance data cluster to serve big data analytics that also supports industry-standard protocols. With complete HDFS support and the scalable performance of the leading parallel file system, IBM Elastic Storage Server (ESS) 5.2 is the perfect building block for big data analytics storage. Built with IBM Spectrum Scale, the HDFS-transparent connector enables open source Hadoop and Spark frameworks to run without any modification. In fact, Hortonworks recently paper-certified IBM Spectrum Scale across its portfolio.
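
As a minimal sketch of what "without any modification" can mean in practice, the PySpark job below is written purely against a standard hdfs:// URI, so the same code would run whether the namespace behind that URI is served by native HDFS or by the HDFS-transparent connector. The host name, port and data path are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("customer-360-report").getOrCreate()

# The hdfs:// URI is the same one the pilot used; nothing in the job changes
# when the namespace behind it is served by the Spectrum Scale connector.
orders = (spark.read
          .option("header", True)
          .option("inferSchema", True)
          .csv("hdfs://analytics-cluster:8020/data/orders.csv"))

# A simple end-to-end view: total order value per customer.
orders.groupBy("customer_id").sum("order_value").show()
```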

The real challenge for IBM Business Partners will be building the business case for ESS 5.2 and IBM Spectrum Scale with the three key stakeholders in a big data analytics project.

The data scientist on the core pilot team will resist any divergence from the open source choices because he or she fears the solution will not perform as the cluster scales. Proven on massive clusters, the IBM Spectrum Scale parallel file system removes the data bottlenecks common to other solutions. It can outperform HDFS on many benchmarks. However, it is the elimination of the data ingest transformation and extraction time that will greatly speed the time to insight and convince data scientists to really look at IBM Spectrum Scale and ESS.
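
To illustrate the ingest point, here is a sketch under the assumption that the Spectrum Scale file system is mounted at a hypothetical POSIX path, /gpfs/analytics, visible to both the application that lands the data and the Spark cluster; the paths and dataset are illustrative only.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("analyze-in-place").getOrCreate()

# HDFS-only flow: data landed by another application must first be copied
# into the cluster before analysis can start, for example:
#   hdfs dfs -put /mnt/landing/clickstream.json /data/clickstream.json

# With a shared Spectrum Scale file system, the framework reads the files
# where they already live, so that ingest copy disappears:
clicks = spark.read.json("file:///gpfs/analytics/landing/clickstream.json")
clicks.createOrReplaceTempView("clicks")
spark.sql(
    "SELECT page, COUNT(*) AS hits FROM clicks GROUP BY page ORDER BY hits DESC"
).show()
```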

For the IT department that needs to support the environment, choosing ESS can lower both the CapEx and the OpEx of the solution. Because ESS uses advanced erasure coding to distribute and protect data, it requires only about 22 percent more physical storage than the data itself. In contrast, HDFS uses three-way replication, consuming 300 percent of the data being analyzed. In addition, ESS is architected to survive multiple failures and preserve data integrity. Redundant data paths and end-to-end checksums make most issues a background repair task rather than an emergency. The ESS GUI provides a complete view of hardware and software, and it integrates into IBM Spectrum Control for a portfolio view of storage and trends.
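
The capacity argument can be made concrete with quick arithmetic based on the figures above; the 500 TB dataset size in this sketch is purely illustrative.

```python
# Back-of-the-envelope raw capacity needed to hold one dataset:
# HDFS default three-way replication versus roughly 22 percent
# erasure-coding overhead on ESS.
dataset_tb = 500  # illustrative dataset size in terabytes

hdfs_raw_tb = dataset_tb * 3.00   # three full copies of the data
ess_raw_tb = dataset_tb * 1.22    # the data plus ~22 percent protection overhead

print(f"HDFS 3x replication: {hdfs_raw_tb:,.0f} TB raw")
print(f"ESS erasure coding:  {ess_raw_tb:,.0f} TB raw")
print(f"Raw capacity saved:  {hdfs_raw_tb - ess_raw_tb:,.0f} TB")
```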

The business-line executive sponsoring the project will probably be aware that open source data storage does not provide the security and governance the data and the results require. Compliance with privacy rules and regulations often requires the ability to audit a fraud, risk or compliance result against archives. These capabilities are straightforward with IBM Spectrum Scale systems, which are supported by IBM Spectrum Protect and most major backup solutions.

However, it may be the vision of the organization’s big data future that an executive will find most compelling. IBM Spectrum Scale and the Hadoop connector can federate multiple data sources into a single HDFS view. It can span geographies for global collaboration. Plus, it can automatically tier to tape, on-premises object storage or the cloud to truly archive and analyze in place.

IBM Spectrum Scale, especially the ESS 5.2 all-flash solution, is a perfect reason to discuss the roadmap for big data analytics with your clients. They may still be in the pilot stage, but you will be ready for them when they move from sandbox to production. Let me know what you think in the comments below.

Doug O’Flaherty
Manager, IBM Spectrum Solutions Marketing

Douglas O'Flaherty leads the IBM Spectrum Solutions marketing teams, which cover the IBM Spectrum Storage and IBM Spectrum Computing portfolios. His background includes both large enterprises and startups, and he has been with IBM since 2015. Mr. O'Flaherty is a long-time evangelist for HPC and big data in commercial applications.
