Articles, Tutorials and Talks
A tale of troubleshooting database performance, with Cassandra and sysdig
In this article, I'll walk through a Cassandra performance issue that we recently dealt with: how we spotted the problem, what troubleshooting we did to better understand it, and how we eventually solved it.
RDBMS & Graphs: Drivers for Connecting to a Graph Database
This week, we'll discuss the language drivers and APIs specific to Neo4j with plenty of resources for further exploration. At this point, if you are curious about other, non-Neo4j graph databases, we encourage you to explore the available drivers within their respective communities.
DynamoDB Design Patterns and Best Practices
In this talk, we'll walk you through common NoSQL design patterns for a variety of applications to help you learn how to design a schema, store, and retrieve data with DynamoDB. We will discuss best practices with DynamoDB to develop IoT, AdTech, and gaming apps.
Redis Performance Monitoring with the ELK Stack
This post looks at how you can do Redis performance monitoring using the ELK Stack to ship, analyze, and visualize the data.
Analyze a Time Series in Real Time with AWS Lambda, Amazon Kinesis and Amazon DynamoDB Streams
This post explains how to perform time-series analysis on a stream of Amazon Kinesis records, without the need for any servers or clusters, using AWS Lambda, Amazon Kinesis Streams, Amazon DynamoDB and Amazon CloudWatch. We demonstrate how to do time-series analysis on live web analytics events stored in Amazon Kinesis Streams and present the results in near real-time for use cases like live key performance indicators, ad-hoc analytics, and quality assurance, as used in our AWS-based data science and analytics RAVEN (Reporting, Analytics, Visualization, Experimental, Networks) platform at JustGiving.
Churn Prediction with PySpark using MLlib and ML Packages
The prediction process is heavily data driven and often utilizes advanced machine learning techniques. In this post, we'll take a look at what types of customer data are typically used, do some preliminary analysis of the data, and generate churn prediction models - all with PySpark and its machine learning frameworks.
SASIIndex, or "SASI" for short, is an implementation of Cassandra's Index interface that can be used as an alternative to the existing implementations. This post goes on to describe how to get up and running with SASI, demonstrates usage with examples, and provides some details on its implementation.
Building an Async Networking Layer for mongos
How to design an HBase data model for recommendations
Getting Started with Couchbase and Spark on Apache Zeppelin
Strong consistency in Manhattan
Financial Markets are Graphs
Using Neo4j to Take Us to the Stars
Spark: Big Data Cluster Computing in Production
"Spark: Big Data Cluster Computing in Production" goes beyond general Spark overviews to provide targeted guidance toward using lightning-fast big-data clustering in production. Written by an expert team well-known in the big data community, this book walks you through the challenges in moving from proof-of-concept or demo Spark applications to live Spark in production. Real use cases provide deep insight into common problems, limitations, challenges, and opportunities, while expert tips and tricks help you get the most out of Spark performance. Coverage includes Spark SQL, Tachyon, Kerberos, ML Lib, YARN, and Mesos, with clear, actionable guidance on resource scheduling, db connectors, streaming, security, and much more.
Interesting Projects, Tools and Libraries
Replicate your key-value store across your network, with consistency, persistence, and performance.
A mail capture and archival server for RethinkDB.
DalmatinerDB is a metric database written in pure Erlang. It takes advantage of some special properties of metrics to make some tradeoffs. The goal is to make a store for metric data (time, value of a metric) that is fast, has a low overhead, and is easy to query and manage.
HDocDB is a client layer for using HBase as a store for JSON documents. It implements many of the interfaces in the OJAI framework.
Upcoming Events and Webinars
Webinar: Continuous Applications: Spark, Kafka, Beam, and Beyond
After this webcast, you'll be able to:
Webinar: Low-latency ingestion and analytics with Apache Kafka and Apache Apex (Hadoop)
This talk will cover fully fault-tolerant, scalable, and operational ingestion from Kafka using an Apache Apex application running natively in Hadoop. The talk will dive deep into the technical details of the connectors in Apache Malhar. Details of production use cases will also be discussed.
Webinar: Data Modeling, Data Querying, and NoSQL: A Deep Dive
Attend and learn:
Webinar: MongoDB and Analytics: Building Solutions with the MongoDB BI Connector
In this webinar, we will cover the architecture needed to use the BI Connector with MongoDB. We will also demonstrate how to build reports with your data.
© All rights reserved, Kompjuter biblioteka, Beograd, Obalskih radnika 4a, Phone: +381 11 252 0 272