Startup Marketeer and Tech Veteran

Thomas Krafft



Top Stories by Thomas Krafft

Interesting article at GigaOm (http://bit.ly/OINpfr). I won’t repeat the main points, but basically it says that since Hadoop is disk-, ETL-, and batch-based, it isn’t a fit for real-time processing of frequently changing data. The author correctly points out that real-time processing (i.e., perceptual real time, meaning response times from sub-second to a few seconds) is becoming a HUGE trend that’s impossible to ignore. He points to Google, which moved away from a Hadoop-style MapReduce approach towards a massively distributed in-memory platform for various projects like Percolator and Dremel… So, what’s new?! The widespread confusion about Hadoop’s role and its applicability is becoming alarming… Hadoop was never designed to process anything in real time, to process live streaming data, or to process anything that’s rapidly changing. Hadoop’s core is HDFS technology – a highly scalable distribu... (more)

Big Data Analytics and BI Strategies: Five Words To Avoid

Over the last 12 months I’ve accumulated plenty of “conversations” in which we’ve discussed big data analytics and BI strategies with our customers and potential users. The five points below represent some of the key takeaways about the current state of the analytics/BI field: why it is, by and large, in a sorry state of affairs, and what some of the obvious telltale signs of the decay are. Beware: some measure of hyperbole is used below to sharpen the contrast… “Batch” This is probably getting obvious to most industry insiders, but it is still worthwhile to mention. If you have “b... (more)

Debunking DRAM vs. Flash Controversy vis-a-vis In-Memory Processing

Wikibon produced an interesting paper (apparently paid for by Aerospike, a NoSQL database vendor that recently emerged by resurrecting the failed CitrusLeaf and acqui-hiring AlchemyDB, and whose product, of course, was recommended in the end) that compares NoSQL databases that store data on flash-based SSDs with those that store data in DRAM. There are a number of factual problems with that paper and I want to point them out. Note that Wikibon doesn’t mention GridGain in this study (we are not a NoSQL datastore per se, after all), so I don’t have any skin in this game other than annoyance with biased and factu... (more)

Micro Cloud in Your JVM: Code Example

A few days ago I blogged about how GridGain easily supports starting many GridGain nodes in a single JVM, which is a huge productivity boost during development. I’ve had a lot of requests to show the code, so here it is (next page). This is an example that we are shipping with the upcoming 4.3 release (entire source code): import org.gridgain.grid.*; import org.gridgain.grid.spi.discovery.tcp.*; import org.gridgain.grid.spi.discovery.tcp.ipfinder.*; import org.gridgain.grid.spi.discovery.tcp.ipfinder.vm.*; import org.gridgain.grid.typedef.*; import javax.swing.*; import java... (more)

GridGain and Hadoop: Differences and Synergies

GridGain is Java-based middleware for in-memory processing of big data in a distributed environment. It is based on a high-performance in-memory data platform that integrates a fast In-Memory MapReduce implementation with In-Memory Data Grid technology, delivering easy-to-use and easy-to-scale software. Using GridGain you can process terabytes of data on 1000s of nodes in under a second. GridGain typically resides between business, analytics, transactional or BI applications and long-term data storage such as RDBMS, ERP or Hadoop HDFS, and provides an in-memory data platform for high p... (more)