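Figure 1: Example flow diagrams for Petri net, BPMN, and Fuzzy Miner. As the figure shows, the Fuzzy Miner output, being a DFG, has no branching nodes of the kind found in Petri nets and BPMN, so although it represents the same process, the branching rules cannot be determined from it.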
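This Convergence/Divergence problem can be called the biggest issue affecting the analysis quality of process mining. In recent years, researchers led by Professor Wil van der Aalst, the godfather of process mining, have been working to solve it with their own methodology, called “Object-Centric Process Mining” (1). In addition, myInvenio implements a function called multi-level mining, which reproduces flows that take convergence and divergence into account by setting multiple case IDs for a single process.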
Latest Process Mining Functionality, Challenges, and Future Evolutionary Trends
1 Latest Functions of Process Mining
Process mining tends to attract attention in terms of technology and tools, but its essence is a theoretical system and methodology (discipline) of data analysis. In fact, as the term “process” mining suggests, it can be considered as a type of data mining. However, unlike data mining, which is a broad concept that targets all kinds of events for analysis, process mining literally targets “processes” for analysis. The basic use of process mining is “process visualization,” and the visualization of processes facilitates the discovery of problems associated with the target processes. As a result, it can play a significant role in process improvement efforts.
1.1 Current Major Functions
As mentioned above, research on process mining started with establishing a methodology for “process visualization” and developing tools for it. This function, which automatically creates a flowchart of business procedures from data extracted from the IT systems used to execute the business, is called “Process Discovery.” Since then, various functions have been implemented as research has progressed and tools have become more sophisticated. The following are the main analysis functions implemented in most current process mining tools.
Automatically creates a flowchart of business procedures and calculates the frequency of work and the time required.
Compares the current (as-is) process discovered from data with the standard (to-be) process and extracts the points where the current process deviates from the standard.
Displays the results of aggregating and analyzing the target processes from various perspectives in graphs and tables.
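The comparison of as-is and to-be processes described above can be illustrated with a deliberately simplified sketch. It assumes the to-be process is given as a set of allowed direct successions between activities; the activity names and case IDs are hypothetical, and actual tools use more elaborate checking techniques than this.

```python
# Hypothetical to-be process: the allowed direct successions between activities.
TO_BE = {
    ("receipt", "confirmation"),
    ("confirmation", "approval"),
    ("approval", "payment"),
}

# Hypothetical as-is traces, keyed by case ID.
AS_IS = {
    "INV-001": ["receipt", "confirmation", "approval", "payment"],
    "INV-002": ["receipt", "approval", "payment"],  # confirmation skipped
    "INV-003": ["receipt", "confirmation", "confirmation", "approval", "payment"],  # rework
}


def find_deviations(trace, allowed):
    """Return the direct successions in a trace that the to-be model does not allow."""
    return [(a, b) for a, b in zip(trace, trace[1:]) if (a, b) not in allowed]


deviations = {case: find_deviations(trace, TO_BE) for case, trace in AS_IS.items()}
for case, found in deviations.items():
    print(case, "conforms" if not found else f"deviates: {found}")
```

Each reported pair points to the exact step where a case left the standard procedure, which is the kind of deviation a conformance function surfaces.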
1.2 Latest Functions
In addition, in recent years, the most advanced process mining tools have begun to include the following latest functions.
Business Rule Mining
When there is a flow branching (decision node) in a target process, it automatically discovers the criteria (business rules) that determine the routing based on the data.
Simulation (What-If Analysis)
Simulate how much improvement can be expected by eliminating or automating some of the tasks in the current process visualized by the process discovery function.
Operational Support
For cases that are currently in progress, the tool takes in data on business execution in real time, detects deviations in operations, predicts future problems, and then alerts the person in charge, suggests the best course of action, or automatically carries out improvement measures.
Of the three latest functions mentioned above, business rule mining and simulation analyze past data, i.e., data on cases that have already been completed, while operational support focuses on supporting smooth business execution by sequentially processing data on cases that are still in progress. In this sense, operational support can be seen as a form of IT solution that goes beyond the framework of an analysis methodology. For this reason, Celonis, the largest company in the process mining industry, calls this function “EMS (Execution Management System).”
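Of these functions, business rule mining lends itself to a small illustration. The sketch below assumes a decision node whose routing depends on a single numeric attribute and searches for the threshold that best explains the observed routes. The attribute (invoice amount), the route names, and the data are hypothetical, and a real tool would consider many attributes with more general techniques.

```python
# Hypothetical cases at a decision node: (invoice amount, route actually taken).
cases = [
    (120, "auto-approve"),
    (300, "auto-approve"),
    (450, "auto-approve"),
    (800, "manager approval"),
    (950, "manager approval"),
    (2000, "manager approval"),
]


def best_threshold(cases, low_route, high_route):
    """Find the amount threshold that best explains the observed routing."""
    amounts = sorted({amount for amount, _ in cases})
    # Candidate thresholds: midpoints between neighbouring observed amounts.
    candidates = [(a + b) / 2 for a, b in zip(amounts, amounts[1:])]
    best, best_correct = None, -1
    for threshold in candidates:
        # Count the cases whose actual route matches the rule for this threshold.
        correct = sum(
            (route == low_route) if amount < threshold else (route == high_route)
            for amount, route in cases
        )
        if correct > best_correct:
            best, best_correct = threshold, correct
    return best, best_correct / len(cases)


threshold, accuracy = best_threshold(cases, "auto-approve", "manager approval")
print(f"rule: amount < {threshold} -> auto-approve (accuracy {accuracy:.0%})")
```

The discovered rule is only as good as the data behind it; the accuracy figure shows how many past cases the rule actually explains.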
2 Issues to Overcome for Process Mining to Be Used Effectively
As seen in the acquisition of Signavio, a major tool vendor, by SAP and myInvenio by IBM, process mining is increasingly recognized as an important tool that is part of IT solutions. However, there are issues that need to be overcome in order for it to be used properly in business practices and to bring results. In this section, I would like to present the main issues from two perspectives.
2.1 Difficulties in data preprocessing
In data mining, it is said that about 80% of the total time required is spent on data preprocessing such as data collection, extraction, and cleaning. The same is true for process mining. It takes a lot of effort to properly integrate dozens to hundreds of data files extracted from various IT systems, to correct dirty data such as missing values and garbled characters, and to create a “data set” that can be fed into tools for analysis. One factor that makes data preprocessing in process mining difficult is that the data comes from a variety of business systems, so an understanding of those systems is necessary. In addition, creating a data set that yields analysis results useful for business process improvement requires understanding the business itself and some familiarity with business improvement methods.
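To give a feel for this work, here is a minimal sketch of merging extracts from two systems into one data set. The two source layouts, their field names, and their timestamp formats are hypothetical assumptions made for illustration; real extracts are far larger and messier.

```python
from datetime import datetime

# Hypothetical extracts from two business systems with different field names
# and timestamp formats.
erp_rows = [
    {"doc_no": "INV-001", "step": "Receipt", "ts": "2023-04-01 09:00:00"},
    {"doc_no": "INV-001", "step": "Approval", "ts": "2023-04-02 14:30:00"},
    {"doc_no": "", "step": "Approval", "ts": "2023-04-02 15:00:00"},  # missing case ID
]
workflow_rows = [
    {"invoice": "INV-001", "task": " confirmation ", "time": "01/04/2023 11:15"},
    {"invoice": "INV-001", "task": " confirmation ", "time": "01/04/2023 11:15"},  # duplicate
]


def normalize(rows, case_key, activity_key, time_key, time_format):
    """Map one source's rows onto the common (case_id, activity, timestamp) layout."""
    events = []
    for row in rows:
        case_id = row[case_key].strip()
        activity = row[activity_key].strip().lower()
        if not case_id or not activity:
            continue  # drop rows that cannot be assigned to a case or activity
        events.append((case_id, activity, datetime.strptime(row[time_key], time_format)))
    return events


events = normalize(erp_rows, "doc_no", "step", "ts", "%Y-%m-%d %H:%M:%S")
events += normalize(workflow_rows, "invoice", "task", "time", "%d/%m/%Y %H:%M")

# Remove exact duplicates, then order by case and time.
event_log = sorted(set(events), key=lambda e: (e[0], e[2]))
for case_id, activity, timestamp in event_log:
    print(case_id, activity, timestamp.isoformat())
```

Even in this toy case, knowing that `doc_no` and `invoice` refer to the same thing requires knowledge of the source systems, which is the difficulty the paragraph above describes.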
2.2 Analysis quality of tools
There are two issues that need to be addressed regarding the quality of analysis. One is the limitation of DFGs (Directly Follows Graphs), and the other is the Convergence/Divergence problem.
2.2.1 Limitations of DFGs
The basic function of process mining, “process discovery,” was initially based on Petri nets, but various algorithms have been developed to reproduce flowcharts closer to reality. However, according to industry experts, most of the process mining tools currently in practical use are said to be based on an algorithm called fuzzy miner (each company is believed to have made its own improvements).
The graphs this algorithm produces are commonly called DFGs (Directly-follows Graphs). Unlike Petri nets and BPMN (Business Process Model and Notation), the world standard for describing business procedures as flowcharts, a DFG is a flowchart in which nodes are connected directly to one another. In other words, since branching nodes are not drawn, a DFG cannot show where and what kind of branching occurs, specifically whether a branch is exclusive (XOR) or parallel (AND). For this reason, even though the current process is reproduced automatically, in reality the result is incomplete because the branching is unclear. Of course, functional improvements have been made in this regard, such as automatic conversion to BPMN-format flowcharts and the adoption of business rule mining as mentioned above.
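The limitation can be made concrete with a small sketch. It builds a plain directly-follows graph (not the fuzzy miner itself, which adds filtering and abstraction on top) from two hypothetical logs, one with an exclusive branch and one with a parallel branch, and compares the arcs leaving the branching activity.

```python
from collections import Counter


def directly_follows_graph(traces):
    """Count how often activity a is directly followed by activity b."""
    dfg = Counter()
    for trace in traces:
        dfg.update(zip(trace, trace[1:]))
    return dfg


def outgoing(dfg, activity):
    """Arcs leaving one node, as {successor: frequency}."""
    return {b: n for (a, b), n in dfg.items() if a == activity}


# Hypothetical logs: after A, either B or C is done (exclusive) ...
xor_log = [["A", "B", "D"], ["A", "C", "D"]]
# ... or both B and C are done, in either order (parallel).
and_log = [["A", "B", "C", "D"], ["A", "C", "B", "D"]]

xor_dfg = directly_follows_graph(xor_log)
and_dfg = directly_follows_graph(and_log)

# Node A has the same outgoing arcs in both graphs; the DFG draws no gateway
# saying whether they mean "one of" or "both".
print(outgoing(xor_dfg, "A"))
print(outgoing(and_dfg, "A"))
```

In the parallel log, extra arcs between B and C appear, which a conversion step can use as a hint, but the graph notation itself carries no gateway that states the branching rule.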
2.2.2 Convergence/Divergence Problem
In process mining, three items, “case ID,” “activity (event),” and “timestamp,” are essential for drawing a flowchart that bundles together the activities performed for each case handled in the target process. For example, in an invoice handling process, the invoice number attached to each invoice and the activities performed on that invoice, such as “receipt,” “confirmation,” “approval,” and “payment,” are extracted from the IT system along with their timestamps.
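As a minimal sketch of how these three items work together, the following groups events by case ID and orders them by timestamp to recover each case's sequence of activities. The invoice numbers and timestamps are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log rows: (case ID, activity, timestamp).
events = [
    ("INV-002", "receipt", "2023-04-03T10:00:00"),
    ("INV-001", "approval", "2023-04-02T14:30:00"),
    ("INV-001", "receipt", "2023-04-01T09:00:00"),
    ("INV-001", "payment", "2023-04-05T16:00:00"),
    ("INV-001", "confirmation", "2023-04-01T11:15:00"),
    ("INV-002", "confirmation", "2023-04-03T13:00:00"),
]

# Bundle events by case ID.
by_case = defaultdict(list)
for case_id, activity, timestamp in events:
    by_case[case_id].append((datetime.fromisoformat(timestamp), activity))

# Sort each bundle by timestamp to recover the order of activities.
traces = {case: [activity for _, activity in sorted(rows)] for case, rows in by_case.items()}
print(traces)
```

Without the case ID the events cannot be bundled, and without the timestamp they cannot be ordered, which is why all three items are required.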
What we often face in the actual process is that there is no single case ID. Let’s take a concrete example. The figure below shows a general image of the process of an engineering company from order receipt to material procurement.
Since the ordered machine must be manufactured to the specifications of the ordering company, after receiving the order the company first designs the machine, then identifies the necessary materials and parts based on the blueprints, and then places orders with suppliers. Since multiple blueprints are created for a single machine, the Blueprint Number is used as the ID in the design stage. The Parts Number is used to identify materials and parts, and at the time of procurement the parts are grouped into several batches and a procurement request is issued for each. At this point, a Procurement Request Number is assigned. The procurement requests are then aggregated by supplier and orders are placed. At this point, the Order Number becomes the ID used for management.
In this way, convergence and divergence are commonly seen in practice as a single case is processed. In the conventional approach, the construction number assigned at the beginning of the process is used as the case ID and the entire process up to the procurement of materials is analyzed, but if the process contains convergence or divergence, the reproduced process is far from the actual situation. (For example, a divergent part is recognized as mere repetition of a task.)
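The effect of divergence on the reproduced flow can be sketched as follows. The example assumes one order that splits into two blueprints and builds traces twice, once per candidate case ID; the identifiers, activity names, and integer timestamps are hypothetical.

```python
from collections import defaultdict

# Hypothetical events for one order (O-1) with two blueprints (BP-1, BP-2):
# (timestamp, activity, order number, blueprint number or None).
events = [
    (1, "receive order", "O-1", None),
    (2, "create blueprint", "O-1", "BP-1"),
    (3, "create blueprint", "O-1", "BP-2"),
    (4, "identify parts", "O-1", "BP-1"),
    (5, "identify parts", "O-1", "BP-2"),
]


def traces_by(events, id_index):
    """Group activities into traces using the column at id_index as the case ID."""
    traces = defaultdict(list)
    for event in sorted(events, key=lambda e: e[0]):
        if event[id_index] is not None:
            traces[event[id_index]].append(event[1])
    return dict(traces)


by_order = traces_by(events, 2)
by_blueprint = traces_by(events, 3)
print(by_order)
print(by_blueprint)
```

With the order number as the case ID, "create blueprint" appears to be repeated as if it were rework. With the blueprint number, each trace is clean, but the link to the order receipt is lost. Neither single ID reproduces the actual process.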
This Convergence/Divergence problem is arguably the biggest issue affecting the analysis quality of process mining. In recent years, researchers led by Professor Wil van der Aalst, the godfather of process mining, have been working on solving this problem using their own methodology called “Object-Centric Process Mining.”
3 Future Direction of Evolution
We have already mentioned that process mining is playing a role as a business support solution beyond the framework of data analysis. In this section, we will discuss how process mining will evolve in the future from a bird’s eye view.
3.1 Process Mining 1.0
The basic function of process mining was originally “process discovery,” which automatically reproduces the current process from data. This is “Descriptive Analysis” in that it depicts the current state as it is.
However, what we originally wanted to do was to extract problem areas such as inefficiencies and bottlenecks hidden in the process. In other words, we need to find out what is wrong with the process. Tools therefore added a function that readily shows where the problem lies, for example that the processing time of a certain part is too long or that there are too many repetitions. This function belongs to Diagnostic Analysis. In process mining tools, it is generally named “Root Cause Analysis.”
The above are analysis functions for historical data and can be called Process Mining 1.0.
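One simple form of such diagnosis is locating the transition where cases wait the longest. The sketch below assumes completed cases recorded as ordered activities with elapsed hours; the data and the "hours since case start" representation are hypothetical simplifications.

```python
from collections import defaultdict

# Hypothetical completed cases: ordered (activity, hours since case start).
cases = {
    "INV-001": [("receipt", 0), ("confirmation", 2), ("approval", 50), ("payment", 54)],
    "INV-002": [("receipt", 0), ("confirmation", 4), ("approval", 76), ("payment", 78)],
}

# Collect the elapsed time for every transition between consecutive activities.
durations = defaultdict(list)
for steps in cases.values():
    for (a, t_a), (b, t_b) in zip(steps, steps[1:]):
        durations[(a, b)].append(t_b - t_a)

# Average per transition; the largest average marks a bottleneck candidate.
average = {arc: sum(hours) / len(hours) for arc, hours in durations.items()}
bottleneck = max(average, key=average.get)
print(bottleneck, average[bottleneck])
```

This only shows where time is lost, not why; finding the cause still requires further analysis of the cases involved.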
3.2 Process Mining 2.0
When process mining starts to take in data on uncompleted, i.e., ongoing, cases in real time as a target of analysis, it becomes possible not only to detect deviations but also to predict how long a currently running case will take to complete and which deviations may occur in the future. The number of tools that implement such predictive analysis is increasing.
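As a rough illustration of predicting completion time, the sketch below estimates the remaining time of a running case as the mean remaining time that completed cases needed after the same activity. This naive averaging is an assumption chosen for brevity, not the method of any particular tool, and the data is hypothetical.

```python
from collections import defaultdict

# Hypothetical completed cases: ordered (activity, hours since case start).
history = [
    [("receipt", 0), ("confirmation", 2), ("approval", 50), ("payment", 54)],
    [("receipt", 0), ("confirmation", 4), ("approval", 76), ("payment", 78)],
]

# For each activity, record how long completed cases still needed after it.
remaining = defaultdict(list)
for steps in history:
    end = steps[-1][1]
    for activity, hours in steps:
        remaining[activity].append(end - hours)


def predict_remaining_hours(last_activity):
    """Naive estimate: mean remaining time observed after the same activity."""
    observed = remaining[last_activity]
    return sum(observed) / len(observed)


# A running case whose latest recorded activity is "confirmation".
print(predict_remaining_hours("confirmation"))
```

The estimate ignores case attributes and current workload, which is one reason the reliability of such predictions is limited in practice.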
Furthermore, based on the prediction results, tools that can suggest what actions should be taken now to shorten the time required or to prevent future deviations from occurring are also emerging. This is the function of “Prescriptive Analysis”.
Process mining analysis of this kind, which deals with data on cases that are not yet complete, is a major upgrade of the existing Process Mining 1.0 and can be called Process Mining 2.0.
Although predictive and prescriptive analyses are still in their infancy and their reliability is not necessarily high, it is certain that they will be introduced to many companies as valuable solutions to support smooth business execution based on enterprise systems such as ERP through further technological progress in the future.
In this article, I’ll explain the flow of using process mining to improve business processes, contrasting it with the procedure of treatment in a hospital.
Process mining aims to discover various issues and problems hidden in the process by visualizing invisible business processes from the event log data.
In terms of this “visualization of the process,” process mining is often likened to an X-ray. However, just as in the treatment of diseases, the ultimate goal is not the discovery of the lesion (inefficiencies and bottlenecks) but the implementation of appropriate treatment (improvement measures) and the return to a healthy state, in other words, the realization of an improved “ideal process” (to-be process).
Let’s start by outlining the flow of medical activities in a hospital. Broadly speaking, there are two stages: the “diagnostic stage” and the “treatment stage”.
The starting point for treatment is when a patient comes in with some kind of symptom such as fever or cough.
First, the doctor interviews the patient, asking about the nature and extent of the current symptoms.
Using an X-ray machine, the area where the lesion is thought to exist will be photographed.
The presence of the lesion is confirmed by looking at the X-ray photograph.
From the X-ray photographs, the doctor determines what disease the patient has.
In addition, various physical examinations and tests are performed to verify that the above diagnosis is correct.
The course of treatment is decided based on the results of the diagnosis and the patient’s wishes, for example whether to perform surgery or to treat the condition with medication.
If it is better to remove the lesion, surgery will be performed.
The treatment is performed by administering medications alone or in conjunction with surgery.
The etiology has been eliminated and the symptoms are gone. Treatment is complete.
Next, we’ll outline the steps for improving business processes, following the hospital’s path of diagnosis and treatment described above.
●Business Process Improvement
Understanding the current situation – Diagnostic stage
Process with problems – Patient
Select as targets for improvement the processes that show visible symptoms of trouble, such as long throughput times, high operating costs, or customer complaints.
Process Setup – Preliminary interview
Basic information related to the process to be improved, such as an overview of the process, the number of cases handled, and the department or person in charge, is organized through interviews. If there are specifications or manuals for the systems involved in the process, these are checked as well.
Process Mining – X-ray
Based on the event log data of the process to be improved, we analyze it using a process mining tool and create a flowchart of the current process.
As is process – X-ray photograph
We analyze the current process from various perspectives, such as frequency and time required.
Problem identification – Diagnosis
Based on the results of the above analysis, we identify the areas that cause the visible problems, i.e., inefficient procedures that take too long or bottlenecks where pending cases pile up.
On-site interview and observation – physical examination
For the problem areas identified, we interview the people in charge on site and conduct observational surveys to pin down the root cause.
Root causes of process inefficiencies and bottlenecks include too many meaningless steps, too many mistakes, too much rework, and too few people assigned relative to the number of cases to be handled.
Improvement Activities – Treatment Stage
Improvement Policy – Treatment Policy
Once we have identified the various problems and issues related to the process and the root causes of these problems and issues, we plan improvement measures.
As the overall improvement policy, it is important to first clarify the objectives, such as shortening throughput time, reducing costs, or improving customer satisfaction.
Implementation of improvement measures – Surgery and Medication
There are a variety of options for improvement measures, ranging from major to minor modifications.
BPR (Business Process Re-engineering), which redesigns the process from scratch, can be compared to surgery. Replacing manual tasks with RPA software robots might be like implanting an artificial heart.
If a small change in procedure is enough to improve the time required, the case is like a disease that can be treated with simple medication.
Improved Process (To be process) – Recovery
Once the desired process has been achieved as a result of effective improvement measures, the project is complete.
Just as regular check-ups are necessary in the treatment of a disease, it is important to continuously monitor the target process to ensure that problems do not recur and that no new problems arise.
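In this way, process mining, data mining and AI, and BPM can be said to complement one another. Professor Wil van der Aalst, the godfather of process mining, has said that “process mining is the bridge between data mining and BPM,” and indeed process mining, as data mining specialized for processes, is expected to play a major role in BPM initiatives.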