Business Intelligence and BPM

The assignment focuses on the basic concepts of applying business analytics to BPM cases and the basics of data wrangling. The assignment comprises two tasks. The first task is to illustrate the relationships between business intelligence and BPM, focusing on the relevance of analytics in BPM. In task 2, you are required to apply knowledge of data wrangling using R to process and perform a simple analysis on text files.

Task 01 

Develop an essay that:

  • Illustrates the relationships between business intelligence and BPM.
  • Defines and elaborates the relevance of analytics in BPM (i.e., performance analytics). You are required to use literature, graphs, and detailed business examples/cases to support your statements.

Task 02  

Data wrangling is a significant process that transforms raw data into a format ready to be consumed by further analytic processes. Processing text data is a typical analytics task, which involves dealing with punctuation, numbers and meaningless stop-words, and converting a text file into a corpus and then a term-frequency matrix. You are tasked to perform data wrangling using R on two text files and are required to address the following questions:

  1. Identify the top 20 meaningful terms and produce graphs for each data set;
  2. Produce a word cloud for the top terms (up to 100) for each data set;
  3. Identify the top 10 topics using an R topic model library for each data set and describe/compare the topics.
  4. Remove terms following # (hashtags, for example #trump), remove hyperlinks, and remove Twitter handles (for example @xxxx), then perform the above three steps again. What are the differences?

Hint: Tutorial relating to wrangling using R. Research string-replacing functions, for example gsub(), for step 4.


Task 01 


The term Business Intelligence (BI) refers to technologies, applications and practices for the collection, integration, analysis, and presentation of business information. One of the key purposes of BI is to ensure that business decisions are supported by the underlying data and not just by the gut instinct of the decision maker(s).

Broadly, the world of analytics can be classified into 3 categories:

  • Descriptive Analytics
  • Predictive Analytics
  • Prescriptive Analytics

Descriptive analytics usually involves the analysis of what happened in the past. It extracts interpretable insights from raw data and informs the firm's future decision making. Predictive analytics typically uses past data to predict the likelihood of future events or outcomes. Prescriptive analytics allows firms to generate the courses of action they ought to take in response to future events.

Business process management (BPM) is defined as the discipline that uses various methods to discover, model, analyze, measure, improve, optimize, and automate business processes.

Relationship between BI and BPM

BPM focuses on understanding the various processes, systems and even people involved in the firm’s operations. It seeks to improve these processes to deliver better results and better products, and to automate them where possible. The core philosophy of BPM is to thoroughly understand the processes and operations performed in the firm. BPM regards processes as a key asset of the firm’s operations. As we step into the age of analytics, firms use information gained from BPM systems to further analyse and improve those systems. This may be done in real time or at defined intervals. The analysis is usually done with the business intelligence tools that the firm has at its disposal.

These analyses yield key insights into how the firm can further improve and optimise its products and operations to deliver value-added products and services to its clients. One of the functions of BPM is to monitor the firm’s products and services. Insights from BI tools can be incorporated into the processes to monitor, in real time, how well the firm has improved its products and services. Integration of BPM and BI acts as a bridge between the analytics and operations wings of the firm.

With the advent of the Internet of Things, data collection, long one of the most arduous parts of BPM, has become much easier. With most of the data stored in cloud computing systems, firms can easily use BI tools to continuously analyse their operations. Cloud computing also reduces the need for physical data warehouses and in-house technical staff to monitor them. It can increase geographical mobility and allows the firm to access its data from different locations without a significant increase in its capital expenditure.

With widespread adoption of BPM and BI tools, firms can improve their strategic planning with the help of actionable insights from the analytics tools. However, it is important for the firm to balance long-term planning with short-term goals, and not be blind to either of them. BPM and BI systems help drive support for new business initiatives by producing and analyzing data in real time. Decision making can be decentralized with company-wide adoption of BPM and BI systems, so that the firm can respond to situations without a bureaucratic system that takes a long time to make decisions. Integration of BPM and BI encourages greater cross-functional interaction and collaboration between diverse parts of the company. BI coupled with BPM helps the company root out inefficiencies in its systems, which cuts costs and helps the company stay competitive.

Planning is a very important part of any company’s strategy. BI and BPM tools help the company chart out its capabilities and limitations. This helps the company to be aware of its limitations and overcome them in order to grow further.

In the modern age, firms practising business process management (BPM) typically use BI tools to improve their decision making, increase profits, cut costs, reduce churn rates, identify untapped revenue streams etc. Integrating BI and BPM tools could help the firm take quick decisions on its various operations. This integration leads to the creation of dedicated data warehouses to collect data from various operations and analyze them, which reduces potential blind spots for the firm. As more firms migrate towards analytics, it helps them improve their efficiency. For example, insurance firms have long used analytics to weed out fraudulent claims by analyzing patterns of past claims and identifying the typical characteristics of fraudulent claims.

However, integrating analytics into day-to-day operations isn’t an easy task. Often, a human is involved in making decisions which are rather subjective and involve consideration of many disparate factors. This has often led to incorrect decisions, which result in lost revenue, losses or inefficient processes. Adding BI into day-to-day business operations improves the efficiency of human decision making. By ensuring the availability of up-to-date data, businesses reduce the chances of incorrect or inefficient decisions.

Most modern analytics offerings are designed so that decision makers have no problem interpreting the data to make their decisions. Most of the underlying data is packaged like a black box: the decision makers are only interested in the outputs and do not concern themselves with the internal workings.

Real world applications

Ingersoll Rand has used analytics to improve operations such as order management, global inventory and invoicing. The company used the Oracle ERP suite to collect operations data. Through the use of analytics, Ingersoll Rand discovered that it was using incorrect data about supplier lead times. In the past, the company had relied on human knowledge to correct such problems, but now its analytics does this automatically. After improving its lead time data, the company was able to improve its revenue forecasting.

Another example of analytics being used to improve decision making is CUNA Mutual. The company is a financial products provider to major credit unions. The number of credit unions in the USA has fallen by 24% since 2000. To improve its financial standing, CUNA Mutual researched ways to improve its products and thereby increase its revenues. To do this, the company launched Voyager, an analytics project. It used Microsoft’s SQL Server as the primary data warehouse, populated with customer data from the sales and marketing teams. Analysis showed that half of CUNA Mutual’s $2.8 billion in revenue was generated by only 3 of its 12 customer segments. To remedy this, the company focussed on developing products that were attractive to the other nine. This is an example of a firm which used analytics to identify untapped revenue streams, which can increase the company’s market share in a fiercely competitive environment. CUNA Mutual also altered its marketing campaigns from 4 large pushes to 12 smaller ones. The smaller campaigns allow more targeted marketing, which increases the likelihood of a customer purchasing one of its products.

DirecTV uses analytics to reduce churn rates. It uses predictive analytics to find ways to keep customers who want to leave the company. These include free services and offering them a better deal than the one offered by rivals. The priority given to calling back a departing customer depends on the revenue that customer generates. This process reduces the amount of money spent on winning back lost customers a few months later.

[Figure: BI pyramid]


This report discusses the relationship between BI and BPM in the modern firm. Integration of BI and BPM leads to several important gains for companies, such as better decision making, lower costs, improved strategic planning and better products and services for clients. The applications of analytics in various firms are also discussed.

Task 02

Topic 1 -> DJT Tweets                     Topic-> BHO Tweets

Unlike the first link, “obama” and “hillary” don’t feature in BHO’s tweets and “maga” is missing from DJT’s tweets. Prominent inclusions include “price” in 20th place in DJT’s tweets, which refers to the former HHS Secretary Price. “tax” has jumped to second place and “will” has claimed pole position in DJT’s tweets. “Puerto” and “rico” still remain in the top 10.
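The cleaning step behind this comparison (step 4 of the task) can be sketched with base R's gsub(), as the hint suggests. This is only a minimal illustration under stated assumptions: the function name clean_tweet and the sample tweet are hypothetical and not taken from the data sets, and the URL pattern is a generic one rather than the exact pattern used in the script below.

```r
# Hypothetical sample text, made up for illustration only
sample_tweet <- "Tax cuts now! #MAGA @someone https://example.com/abc thanks"

# clean_tweet is an illustrative helper name, not part of any package
clean_tweet <- function(text) {
  # Remove hyperlinks first, since a URL can itself contain '#' or '@'
  text <- gsub("https?://\\S+", "", text, perl = TRUE)
  # Remove hashtags such as #trump
  text <- gsub("#\\w+", "", text, perl = TRUE)
  # Remove Twitter handles such as @xxxx
  text <- gsub("@\\w+", "", text, perl = TRUE)
  # Collapse the leftover runs of whitespace and trim the ends
  text <- gsub("\\s+", " ", text, perl = TRUE)
  trimws(text)
}

cleaned <- clean_tweet(sample_tweet)
print(cleaned)
```

Removing hashtags before building the corpus is what makes a term like “maga” drop out of the frequency counts, because removePunctuation would otherwise turn #maga into the plain word maga.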







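#Loads the packages used by this script

library(tm)
library(stringr)
library(wordcloud)
library(RColorBrewer)
library(topicmodels)

#Reads in the txt file

tweets01 <- paste(readLines("F:/DS/tweets_trump.txt"), collapse=" ")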

#Strips out hyperlinks, hashtags and twitter handles, Comment out for 1st part
#Hyperlinks are removed first because a URL can contain # or @

clean_tweet01<- str_replace_all(tweets01,"https?://\\S+","")

clean_tweet01<- str_replace_all(clean_tweet01,"#[A-Za-z0-9_]+","")

clean_tweet01<- str_replace_all(clean_tweet01,"@[A-Za-z0-9_]+","")

#Input to VectorSource() in 1st part will be tweets01

tweets01corpus <- Corpus(VectorSource(clean_tweet01))

tweets01dtm <- DocumentTermMatrix(tweets01corpus)

#findFreqTerms(tweets01dtm, lowfreq=30)

#Cleans out punctuation, numbers, whitespaces and changes the text to lowercase

tweets01corpus <- tm_map(tweets01corpus, removePunctuation)

tweets01corpus <- tm_map(tweets01corpus, removeNumbers)

tweets01corpus <- tm_map(tweets01corpus, stripWhitespace)

tweets01corpus <- tm_map(tweets01corpus, content_transformer(tolower))

#Removes stopwords and trump

tweets01corpus <- tm_map(tweets01corpus, removeWords, c(stopwords("english"), "trump"))

#Creates DTM for further analysis

tweets01dtm <- DocumentTermMatrix(tweets01corpus)



#inspect(tweets01dtm [1,1])


#Sorts the most frequent terms in descending order

tweets01matrix <- as.matrix(tweets01dtm)

tweets01frequency <- colSums(tweets01matrix)

tweets01frequency <- sort(tweets01frequency, decreasing=TRUE)



#Plots the barplots of the most frequent terms in tweets

barplot(tweets01frequency[1:10], las = 2, col ="red", main ="Top ten - tweets 01", ylab = "TF")

barplot(tweets01frequency[1:20], las = 2, col ="blue", main ="Top Twenty - DJT Tweets", ylab = "TF")

tweets01words <- names(tweets01frequency)

#par(mfcol = c(1, 2))

#Prints out the word cloud

wordcloud(tweets01words, tweets01frequency, max.words = 100, colors=brewer.pal(8, "Dark2"),random.color = FALSE)

#Barack Obama Tweets Analysis

tweets02 <- paste(readLines("F:/DS/tweets_obama.txt"), collapse=" ")

#Comment out the following 3 lines for first part
#Hyperlinks are removed first because a URL can contain # or @

clean_tweet02<- str_replace_all(tweets02,"https?://\\S+","")

clean_tweet02<- str_replace_all(clean_tweet02,"#[A-Za-z0-9_]+","")

clean_tweet02<- str_replace_all(clean_tweet02,"@[A-Za-z0-9_]+","")

#Input to VectorSource() in 1st part will be tweets02

tweets02corpus <- Corpus(VectorSource(clean_tweet02))

tweets02dtm <- DocumentTermMatrix(tweets02corpus)

#findFreqTerms(tweets02dtm, lowfreq=30)

tweets02corpus <- tm_map(tweets02corpus, removePunctuation)

tweets02corpus <- tm_map(tweets02corpus, removeNumbers)

tweets02corpus <- tm_map(tweets02corpus, stripWhitespace)

tweets02corpus <- tm_map(tweets02corpus, content_transformer(tolower))

tweets02corpus <- tm_map(tweets02corpus, removeWords, c(stopwords("english"), "obama"))

tweets02dtm <- DocumentTermMatrix(tweets02corpus)



#inspect(tweets02dtm [1,1])


tweets02matrix <- as.matrix(tweets02dtm)

tweets02frequency <- colSums(tweets02matrix)

tweets02frequency <- sort(tweets02frequency, decreasing=TRUE)



#Prints the bar plots of most frequent terms in Barack Obama’s tweets

barplot(tweets02frequency[1:10], las = 2, col ="red", main ="Top ten - BHO", ylab = "TF")

barplot(tweets02frequency[1:20], las = 2, col ="blue", main ="Top Twenty - BHO Tweets", ylab = "TF")

tweets02words <- names(tweets02frequency)

#par(mfcol = c(1, 1))

#Prints out the word cloud

wordcloud(tweets02words, tweets02frequency, max.words = 100, colors=brewer.pal(8, "Dark2"),random.color = FALSE)

#This code block finds out the most used topics

tweetsdtm = c(tweets01dtm, tweets02dtm)

ldaOut<-LDA(tweetsdtm, 2, method="Gibbs")

ldaOut.terms<- as.matrix(terms(ldaOut,20))

ldaOut.topics<- as.matrix(topics(ldaOut))



#Prints the barplots side by side

par(mfcol = c(1, 2))

barplot(tweets01frequency[1:10], las = 2, col ="red", main ="Top ten - DJT Tweets", ylab = "TF")

barplot(tweets02frequency[1:10], las = 2, col ="green", main ="Top ten - BHO Tweets", ylab = "TF")
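
To make the frequency-sorting step in the script above easier to follow, here is a small base R sketch of the same idea: count how often each term occurs, then sort the counts in descending order. It does not use the tm package, and the three short documents are hypothetical text made up for illustration, not the tweet data.

```r
# Hypothetical mini-corpus, for illustration only
docs <- c("tax cuts will help", "tax reform will pass", "will vote")

# Lowercase the text and split it into single terms on whitespace
tokens <- unlist(strsplit(tolower(docs), "\\s+"))

# Count each term and sort by frequency, highest first
# (the same role colSums() and sort() play on the document-term matrix)
freq <- sort(table(tokens), decreasing = TRUE)

# Keep the two most frequent terms
top_terms <- head(freq, 2)
print(top_terms)
```

In the real script the counts come from the document-term matrix, so stop-word removal and the other tm_map cleaning steps have already been applied before the terms are ranked.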