Thursday, August 14, 2014

Process vs. Data: Which Comes First

Data is the lifeblood, and the process is the body function of the enterprise.

Data is like the lifeblood of a business; processes are like body functions, made up of steps that convert inputs to outputs. Process management is about managing what is known from that flow. Process vs. data: which one should come first?

Logically, the process always comes first: Processes exist whether or not they are documented. Businesses had processes before flowcharting was invented and before computers existed. The process is "what are you trying to do, what result are you trying to achieve." Then you can ask, "what data do you need?" If you don't know what you are trying to do, how can you possibly know what data you might need? At the same time, you can see the value of collecting lots of data, since you never know what might be needed in the future, but that is a different question.
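To make the "process first, then data" reasoning concrete, here is a minimal sketch in Python. It assumes a process can be written down as an ordered list of steps, each declaring the data it needs and the data it produces. The `Step` class, the `external_data_needed` helper, and the onboarding steps are all hypothetical, invented only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One step of a process: a name plus the data it consumes and produces."""
    name: str
    needs: set[str] = field(default_factory=set)
    produces: set[str] = field(default_factory=set)


def external_data_needed(steps: list[Step]) -> set[str]:
    """Return the data elements the process needs but does not produce itself."""
    available: set[str] = set()
    required: set[str] = set()
    for step in steps:
        # Anything a step needs that no earlier step produced must come from outside.
        required |= step.needs - available
        available |= step.produces
    return required


# Hypothetical onboarding process, defined before any data model exists.
onboarding = [
    Step("verify identity", needs={"customer_name", "id_document"}, produces={"identity_verified"}),
    Step("open account", needs={"identity_verified", "customer_name"}, produces={"account_number"}),
    Step("send welcome", needs={"account_number", "email"}),
]
```

Calling `external_data_needed(onboarding)` lists what has to be collected or sourced from elsewhere; the data requirements fall out of the process definition rather than the other way around.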

In an ideal world, the process should be primary. Although businesses usually use inherited systems as data sources, the process should not be constrained by them. You should only have to construct data adapters for the process to work properly. But in the real world, you are always constrained by existing systems and databases. A process never stands alone, and it is rarely the sole source of the data that any meaningful process needs. So while you consider the process primary, you must treat existing data sources as constraints. Ignoring them while designing processes is likely to lead to a bad result and rework.
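The data adapter idea can be sketched as follows, again in Python and under stated assumptions. The process declares the fields it expects, and each inherited system gets a small adapter that maps its own layout onto those fields. The field names, the `make_adapter` helper, and the legacy CRM column names (`CUST_NO`, `FIRST`, `LAST`, `MAIL`) are hypothetical.

```python
from typing import Callable

# Fields the (hypothetical) process expects, regardless of where the data lives.
PROCESS_FIELDS = ("customer_id", "full_name", "email")


def make_adapter(mapping: dict[str, Callable[[dict], object]]) -> Callable[[dict], dict]:
    """Build an adapter that converts one legacy record into the process's shape."""
    missing = set(PROCESS_FIELDS) - set(mapping)
    if missing:
        # The existing source acts as a constraint: fail early if it cannot supply a field.
        raise ValueError(f"adapter does not supply: {sorted(missing)}")

    def adapt(record: dict) -> dict:
        return {name: mapping[name](record) for name in PROCESS_FIELDS}

    return adapt


# Hypothetical legacy CRM layout: CUST_NO, FIRST, LAST, MAIL.
crm_adapter = make_adapter({
    "customer_id": lambda r: str(r["CUST_NO"]),
    "full_name": lambda r: f'{r["FIRST"]} {r["LAST"]}'.strip(),
    "email": lambda r: r["MAIL"].lower(),
})
```

The early `ValueError` is the design point: a source that cannot supply a required field is discovered while the process is being designed, not after it has been built.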

It’s just another chicken-and-egg debate: As long as you have the freedom to design what you want to do, you design the process and define the data you need along the way, and all is good. In many cases, however, you are forced to look at existing processes and existing data collection, and the challenge is to make things "lean." Do you need to collect the data at all, or is there a way to streamline the process and get the same data with less effort? Do you collect Big Data, and are you benefiting from it? When you have to combine Data Management initiatives with Business Process improvement, only a concerted approach will work, and overlap is inevitable. So it turns out to be another chicken-and-egg debate.

Data drives processes when multiple groups, departments, or people are involved in the process. For example, the customer onboarding and dispute tracking processes are highly dependent on data, data elements, and business rules for process optimization. In these scenarios, data has a significant role in defining the process, and data comes first. A good process includes an understanding of value, function, data, responsibility, and control flow. Function and data are thus integral to a good process.

Always emphasize the data part: Though the process may come first, always put emphasis on the data. Each step generates data, ranging from a simple "done" declaration by the individual performing the step to the collection of, for example, the various data elements needed by steps downstream of the step in focus. Start with the practices you want to incorporate as processes. Data quality, frequency of occurrence, the way data variance impedes decisions, and many other analytical factors have a major influence on the finer parts of a process definition. Although the process definition is regarded as the principal skeleton of an enterprise's business regulations, data science contributes a great deal to defining it in fine detail. When you talk about Business Decision Management, the data analysis should come first, so that social learning is embedded within the process. Recent developments in Digital Transformation can also help embed your processes in omnichannel experiences, provided data analytics and data science have been well thought out and recognized within the definition of your process.

Like so many things, it depends: The technical answer would be process first, because someone had to do it the first time in order to measure it. But the reality is that over 90% of the processes in today's world are based on some previous iteration. As new mediums and environments emerge, the initial balance starts heavy on process and then shifts toward data over time. An outsider who comes into an organization and is unable to find anyone who can describe a particular process can reverse engineer that process by carrying out data mining. If the process is mostly linear, it is easy to reverse engineer. If the process is complex, with multiple branching points where different pathways are taken on the basis of data, experience, judgment, intuition, and, at times, arbitrary action, it becomes more difficult to derive the process from data.
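One common starting point for reverse engineering a process from data is to count which activity directly follows which in an event log. The Python sketch below assumes the log is available as `(case_id, timestamp, activity)` tuples; that format, the function names, and the sample log are assumptions for illustration, not something the post prescribes. The `is_linear` check is a rough heuristic: it only tests whether any activity has more than one distinct successor.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often activity A is immediately followed by B within a case.

    `events` is an iterable of (case_id, timestamp, activity) tuples.
    """
    by_case = defaultdict(list)
    for case_id, timestamp, activity in events:
        by_case[case_id].append((timestamp, activity))

    counts = Counter()
    for trace in by_case.values():
        # Order each case by timestamp, then pair every activity with its successor.
        ordered = [activity for _, activity in sorted(trace)]
        counts.update(zip(ordered, ordered[1:]))
    return counts


def is_linear(counts):
    """True if no activity has more than one distinct successor."""
    successors = defaultdict(set)
    for a, b in counts:
        successors[a].add(b)
    return all(len(s) == 1 for s in successors.values())


# Hypothetical event log with two cases that diverge after "review".
log = [
    ("c1", 1, "receive"), ("c1", 2, "review"), ("c1", 3, "approve"),
    ("c2", 1, "receive"), ("c2", 2, "review"), ("c2", 3, "reject"),
]
```

For this log, "review" is followed by both "approve" and "reject", so the data shows a branch exists but not why it was taken; that is the harder part of deriving a complex process from data.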

Hence, it is just another chicken-and-egg debate: process and data go hand in hand. Processes are defined around outcomes, KPIs, and objectives to improve in a defined area, and the data needed for them must be defined as well. Similarly, data has no meaning until it is owned and given an identity. Processes also give data a shape that makes it quantifiable.


This is a very important topic that will be increasingly strategic in the coming years. Thank you for posting it.

From the inception and premises of Relational Databases, the “separation of the management of the data” from the application was an unquestioned truism. For a while, that was the fundamental precept of RDBMSs. Multiple applications were then allowed to share the same data. Then BPMSs followed suit with the separation of the process from the “application”: multiple applications were able to share common processes, customizing them as needed.

Now we are entering a very different dynamic, fluid, and adaptive era with Digital Enterprises. Our old notions of “separation” are being challenged, and to support this brave new world, we need to think of Process and Data as two sides of the same coin. Building digital solutions and applications quickly and creatively needs both. But it is not just about Process and Data...

Perhaps even more importantly, it is also about Decisioning: the business logic that is often modeled and executed from two essential sources, the heads of experts or knowledge workers, and models mined from data. The former is the realm of Business Rules and the latter is the realm of Predictive Models. It is the “i” (“Intelligence”) in iBPM. Thus Process, Decision, and Data are the three wheels (as the Digital Enterprise is in constant motion) that will drive enterprises in the very near future...

Data appears first, as a result of operations (hence the process appears first), and applications come after, not the other way around, even though the opposite approach is widely implemented. In this blog post, I elaborate on this reasoning in depth; I presented it this year at the Enterprise Architecture Conference Europe.

I agree with Alberto - data comes first! In the real world, data is all around us, and in many cases it exists with no immediate relevance to human existence; nature does what it does, presenting data to us. Data is Facts, Observations and Questions about something.
Abby Covert in her book "How to Make Sense of Any Mess" points out that data is "Information in the eyes of the beholder", and that data is always present in the world - we can't always strongly control and influence the conversion from data to information. Information is subjective and not objective. Information is whatever a user interprets from the arrangement or sequence of things they encounter.
In 30 years of architecting, designing and developing software, I have found that it is not until we apply intent, method and process that "logic" becomes a first-class, useful working thing, and that the structured grouping and organization of data (Taxonomy) becomes useful when building Information Systems in software.
Years ago, I worked for TI in the group that developed the IEF CASE tool. We had these things called PADs (Process Action Diagrams) and PRADs (Procedure Action Diagrams), and we frequently had energetic conversations and debates regarding data and process. Our users taught us that in the real world they encounter data first and then try to use that data in the application of getting something done (Process).
The intention and utility of using computer and software systems demand that we create logical workflow, process and task models to be productive when groups and teams of users interact with the data managed by information systems. The application of “logic” is necessary because we are trying to instruct a machine to aid us in working towards an intended goal/outcome. Conversely, when I am having a conversation with a friend regarding my golf swing, I am not thinking about the composition and execution of a logical process; I just want to make sure the facts are indeed facts.
Recently, in designing and developing an effective and modern “Contact Management” system, we have found that the adage of “Garbage-in-Garbage-out” still holds water quite well. Bad data going in can create havoc in any communications model that human beings depend on.
Consider this use-case scenario: “It is my intention to call John to review the details of purchasing the home we looked at yesterday; I need his phone number to reach him and talk about next steps to take.” If I don’t have the right phone number, then I can’t reach John, and now I have to do a whole host of other things just because I have a wrong number for him: a waste of time, energy, effort and money!
This type of thing happens all day long in the real world, and the process of calling a contact is going to fail if the “fact” of a phone number is not actually a fact. We might think of the “Calling Process” first, but I must have the data “facts” in hand before I can do anything useful.
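The point that the “Calling Process” depends on having usable facts first can be sketched as a precondition check. This is a minimal Python illustration under loud assumptions: the `normalize_phone` and `call_contact` functions and the contact dictionary shape are hypothetical, and the pattern is only a loose plausibility test (7 to 15 digits with an optional leading `+`), not a validation of any real numbering plan.

```python
import re

# Assumption: a loose plausibility check (7 to 15 digits, optional leading +),
# not a full validation of any national numbering plan.
_PHONE = re.compile(r"^\+?\d{7,15}$")


def normalize_phone(raw):
    """Strip common separators; return the number if plausible, else None."""
    if not raw:
        return None
    candidate = re.sub(r"[\s().-]", "", raw)
    return candidate if _PHONE.match(candidate) else None


def call_contact(contact):
    """Run the 'calling process' only when its data precondition holds."""
    number = normalize_phone(contact.get("phone"))
    if number is None:
        return f"cannot call {contact['name']}: no usable phone number"
    return f"calling {contact['name']} at {number}"
```

A check like this only catches missing or malformed numbers. A well-formed but wrong number for John still passes, which is why the quality of the data going in matters as much as the process built around it.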
No Chicken or Egg problem here: when my body’s bladder generates data that travels along my nervous system and registers via cathexis in my brain, my logical mind is equipped with a cataloged process that details the actions necessary to relieve myself; I find the nearest rest room and the problem is solved. I did not think first, “I need to use the rest room.”
