Big Data Analysis vs. Big Picture Analysis

I’m sure you have heard the buzzword Big Data by now. Companies and governments are gathering enormous amounts of data about almost everything. Big Data analysis tries to make sense of this mountain of data so that people can make intelligent decisions. It is this Big Data analysis I have a problem with. To be more specific, the problem I see is people doing Big Data analysis without seeing the big picture first. I would call it a “Big Picture Analysis” when you have not only all the data at hand but also the reasons why you have so much data in the first place.

Let me explain.

Say you have a system, or maybe even many computer systems, that generate data you want to analyze at some point in the future. You may or may not know yet how you want to analyze all this stuff, but you do know that it might come in handy one day. So, you store a ton of information. By that, I mean the system stores a ton of information in databases, log files, and so on. Most of the time, you don’t let the system delete anything. You just let it gather more and more information because storage space is cheap in AWS.

Let’s assume you decide to look inside this mountain of data because you, or rather the business, can take advantage of it to learn more about your customers and hopefully sell more products and/or services.

When you look inside your mountain of data using the latest Big Data analysis tools, you discover certain facts and statistics. You gather, sum up, divide, formulate, massage the data, and so on. At some point, you need to put these analysis results into some form of presentation that can be used to make decisions: reports, dashboards, and the like.

Now, here comes the crux. With this whole mountain of data, how can you be certain why you have it in the first place? I mean, why did your system(s) store all this data? Obviously, they stored it because they were designed to store it in databases and so on. But I’m trying to get you to see this from a business point of view. If you have modeled your system based on the domain, then you should be somewhat familiar with the data that was stored in the database. When a domain expert looks at some of the data, that person might see certain indicators of what the data is about. Or that person might have no clue, even though he or she is a business expert, a domain expert.

My point is that when you look at Big Data, you should also look at the reasons why this Big Data exists. Only then can you make the full connection and see the “Big Picture.” When you see the cause for the Big Data, you can make better assumptions and draw better conclusions after you have completed the analysis. You will be able to follow the “thread” from start to end. When you create a report after your Big Data analysis, you should show the causes side by side with the results on that report. Only then can you see the Big Picture.

So, how do you do that? If you are a big fan of domain-driven design (DDD), then you are almost there. When you model a domain, you usually also model domain events. A domain event reflects something significant that has happened inside your domain. The past tense is important here: these things have already happened. Domain events capture them and let you record that they took place. This is when things get very exciting. Imagine what you can do here. Your domain model not only operates on the business domain but also records anything interesting that was triggered for business reasons. When you look at your recorded domain events at certain dates and times, you can connect your mountain of data with the reasons why that mountain of data was born. The domain events are the reasons why you have so much data. Your reports can reflect and show this connection between domain events and stored data.
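
To make this a little more concrete, here is a minimal sketch of what a recorded domain event could look like in C#. The event name and its properties are made up for illustration; what matters is the past-tense name and the timestamp, which is what lets you later connect the event to the data it produced:

    using System;

    // A hypothetical domain event. Note the past tense: it records
    // something that has already happened inside the domain.
    public class CustomerPlacedOrder
    {
        public Guid CustomerId { get; private set; }
        public Guid OrderId { get; private set; }
        public decimal OrderTotal { get; private set; }
        public DateTime OccurredOnUtc { get; private set; }

        public CustomerPlacedOrder(Guid customerId, Guid orderId, decimal orderTotal)
        {
            CustomerId = customerId;
            OrderId = orderId;
            OrderTotal = orderTotal;
            // The timestamp is what lets reports join this event
            // against the stored data later on.
            OccurredOnUtc = DateTime.UtcNow;
        }
    }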

In the end, your analysis receives a significant confirmation and validation, and with it more accurate information. This leads to an even better understanding of the data and, ultimately, smarter decisions for those who need this information.

Microservices by Martin Fowler

In the last six months or so, I’ve become a huge fan of Microservices. I love the concept because Microservices are a fantastic extension of domain-driven design. One of the core behaviors of Microservices is that no other process is allowed to talk directly to the data a Microservice contains (in a database, for example). Any process that needs access to this data has to talk to the Microservice itself. This is exactly what a business entity (aggregate or not) in a domain model does, and it is what encapsulation in object orientation is all about. To be more specific, a typical domain object has behavior and data, as you know. A domain object will not (and should not) let any other object or process access its data except through its behavior(s). That’s why domain objects (entities) have clearly defined interfaces to talk to.
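
As a quick illustration, here is a tiny, made-up domain entity in C#. Its state is private, and the only way to change that state is through its behavior, which enforces the business rules:

    using System;

    // Hypothetical entity: the data is private, access goes through behavior.
    public class BankAccount
    {
        private decimal _balance; // no other object can touch this directly

        public void Deposit(decimal amount)
        {
            if (amount <= 0)
                throw new ArgumentException("Deposit amount must be positive.");
            _balance += amount;
        }

        public void Withdraw(decimal amount)
        {
            if (amount > _balance)
                throw new InvalidOperationException("Insufficient funds.");
            _balance -= amount;
        }
    }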

Now, if you scale this behavior up into components that run in their own processes, Microservices do the same thing from a component point of view: no object running in one process is allowed to access another process’s data unless it talks to the behavior of that other component, the other Microservice. This has been my observation, and I have really come to like the concept of Microservices. They are a fantastic way of building cloud-based software, and with the right PaaS solution you can create Microservices on premises and simply deploy them to a PaaS that runs on premises or in a public cloud such as Cloud Foundry or AWS, for example.

I’m a huge fan of Martin Fowler. It is not “official” until Martin Fowler has written or spoken about a subject in the software industry. Anyway, here is Martin Fowler’s awesome 25-minute explanation of Microservices.

Updated C# Reference Implementation

I have updated my C# reference implementation and included FluentValidation on some of the DTO objects. I also updated the ErrorMap to include validations on the server side as well as on the WPF client side. This version also includes a sample SQL Server Persistence Provider. As always, you can get the latest code on my GitHub repo.
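
In case you have not seen FluentValidation before, here is a rough sketch of what a DTO validator looks like. The CustomerDto and its rules below are made up; the actual validators are in the repo:

    using FluentValidation;

    // Hypothetical DTO, just to show the FluentValidation style.
    public class CustomerDto
    {
        public string FirstName { get; set; }
        public string Email { get; set; }
    }

    public class CustomerDtoValidator : AbstractValidator<CustomerDto>
    {
        public CustomerDtoValidator()
        {
            RuleFor(x => x.FirstName).NotEmpty().WithMessage("First name is required.");
            RuleFor(x => x.Email).NotEmpty().EmailAddress();
        }
    }

    // Usage:
    //   var result = new CustomerDtoValidator().Validate(dto);
    //   if (!result.IsValid) { /* surface result.Errors through the ErrorMap */ }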

Slides for Sacramento .NET User Group

Update 04-05-2015: I completed the SQL Server Provider. Latest code is on GitHub.

I had a lot of fun presenting last night at the Sacramento .NET user group. It was great to hear that people learned a lot and are looking forward to incorporating what they learned about the Provider Model design pattern and object persistence in general into their own projects. The slides are available for download here. The source code of the entire reference implementation is available here. I will be finishing up the SQL Server provider within the next few days and will make it available in my GitHub repo.

Object Persistence Reference Implementation

I’ve been updating my reference implementation over the last few days. I’m actually using this reference implementation in my own projects. You can download the latest version from my GitHub repo.

This is a complete .NET C# reference implementation to help you jump-start a service-oriented system running in a cloud environment such as Amazon’s EC2 or in on-premises clusters.

This reference implementation shows you how to build both the client and the server side. The client side is a sample WPF application that communicates with the service side via HTTP REST requests using JSON payloads. Of course, you can use any type of client as long as it can communicate via HTTP with REST-based JSON.
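
Just to give you a feel for the client side, a REST call with HttpClient and a JSON payload could look something like the sketch below. The endpoint URL and the Customer type are made up, not the actual ones from the repo:

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;
    using Newtonsoft.Json;

    public class Customer
    {
        public Guid Id { get; set; }
        public string Name { get; set; }
    }

    public static class CustomerClient
    {
        // Hypothetical endpoint; the reference implementation defines its own routes.
        private const string BaseUrl = "http://localhost:9000/api/customers/";

        public static async Task<Customer> GetCustomerAsync(Guid id)
        {
            using (var client = new HttpClient())
            {
                var response = await client.GetAsync(BaseUrl + id);
                response.EnsureSuccessStatusCode(); // throws on non-2xx responses
                var json = await response.Content.ReadAsStringAsync();
                return JsonConvert.DeserializeObject<Customer>(json);
            }
        }
    }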

The service side uses a Web API 2 service layer that communicates with a central domain model. It demonstrates how to handle exceptions and edge cases and how to communicate failure to the client.
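
The core idea on the service side is to return meaningful HTTP status codes instead of letting exceptions leak to the client. Here is a rough Web API 2 sketch, reusing the made-up Customer type from the client sketch above; the controller, repository, and route are likewise invented for illustration:

    using System;
    using System.Web.Http;

    // Hypothetical repository contract; the real one sits behind the domain model.
    public interface ICustomerRepository
    {
        Customer FindById(Guid id);
    }

    public class CustomersController : ApiController
    {
        private readonly ICustomerRepository _repository;

        public CustomersController(ICustomerRepository repository)
        {
            _repository = repository;
        }

        // GET api/customers/{id}
        public IHttpActionResult Get(Guid id)
        {
            try
            {
                var customer = _repository.FindById(id);
                if (customer == null)
                    return NotFound();          // 404: the entity does not exist

                return Ok(customer);            // 200: serialized to JSON by Web API
            }
            catch (Exception ex)
            {
                return InternalServerError(ex); // 500: communicate the failure explicitly
            }
        }
    }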

The persistence layer demonstrates the extremely powerful provider pattern by storing the domain objects in the following databases:

  1. db4o (an object database)
  2. Redis (a NoSQL database)
  3. SimpleDB (a NoSQL database)
  4. SQL Server (coming soon)

Please note that the entire system has no knowledge of how the objects are stored. All implementation details are in the individual providers listed above. This means you can switch the persistence provider without having to recompile, and therefore move a running system from one persistence store to another.
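
The mechanics behind such a switch are roughly this: the system codes only against a provider interface, and the concrete provider is named in configuration and loaded via reflection at runtime. Here is a simplified sketch; the interface and config key are made up, and the actual contracts are in the repo:

    using System;
    using System.Configuration;

    // Simplified provider contract; the real one in the repo is richer.
    public interface IPersistenceProvider
    {
        void Save(object entity);
        T Load<T>(Guid id);
    }

    public static class PersistenceProviderFactory
    {
        // Reads the fully qualified type name from App.config, e.g.:
        // <add key="PersistenceProvider"
        //      value="MyApp.Providers.RedisProvider, MyApp.Providers" />
        public static IPersistenceProvider Create()
        {
            var typeName = ConfigurationManager.AppSettings["PersistenceProvider"];
            var providerType = Type.GetType(typeName, throwOnError: true);
            return (IPersistenceProvider)Activator.CreateInstance(providerType);
        }
    }

    // Switching from, say, Redis to SQL Server is then a configuration
    // change only; the system itself is never recompiled.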

I will try to create a sample SQL Server provider soon.

Visual MASM IDE 1.0 is Released

I’m proud to announce the first release of my Visual MASM IDE 1.0. With Visual MASM, you can program assembly applications for Windows 32-bit, Windows 64-bit, and even MS-DOS 16-bit COM and EXE files. Visual MASM uses Microsoft’s powerful Macro Assembler but makes it easier to manage all of your Windows assembly programs. I have included simple Hello World applications that range in size from 254 bytes (a COM file) to a whopping 2.5 KB for Windows 32-bit and Windows 64-bit applications. Head on over and download it; it’s free.

appsworld North America 2015 at Moscone Center West, San Francisco


I’m a confirmed speaker at appsworld North America 2015 at Moscone Center West, May 12-13, in San Francisco, CA. Discover the future of multi-platform apps. See all confirmed speakers. This is sure to be an exciting event. I’m still working on my presentation, which will include Redis, Amazon AWS, C#, and more. I will see you there.

Intel Galileo Gen 2 and Windows 10


As if there were not enough reasons already to write Windows applications in assembly (see my Visual MASM project site), I found out just yesterday about a great new development in the SoC (System on a Chip) industry. Intel came out with the Intel Galileo Gen 2, which features a full-blown 586 Pentium-class computer. What’s so awesome? It’s super tiny, just a little bigger than a credit card, and you can install and run Windows or Linux on it. How freaking cool is that? No noise and about 5 watts of power consumption… the possibilities are incredible. Put Windows 10 on it for free (here is how), create your Windows applications in MASM and my Visual MASM, and off you go… amazing. Go check out Microsoft’s Internet of Things (IoT).

Presenting “Object Persistence in C#” in Sacramento, CA, March 25th, 2015

On Wednesday, March 25th, 2015, I will be presenting “Object Persistence in C#” at the Sacramento .NET User Group (SAC.NET) at the Microsoft office at 1415 L Street, Suite 200, Sacramento, CA 95814, starting at 6:00 pm. Maria Martinez, co-organizer of the Sacramento .NET User Group, was kind enough to help get this organized. Thank you, Maria. I will see you there.

Presenting “Object Persistence in C#” at Bay.NET User Group on April 9th, 2015

I will be presenting “Object Persistence in C#” at the Bay.NET user group at Berkeley City College, Room 451A, 2050 Center Street, Berkeley, CA, from 6:15 pm to 9:00 pm.

Deborah Kurata, co-organizer and East Bay chapter leader, was kind enough to help get this organized. Thank you, Deborah.

I will see you there.

