Business Intelligence Essay Example
Part I Questions
- Generally, the term warehouse refers to a building used for the storage of goods, mostly by manufacturers, importers, exporters and wholesalers. A data warehouse, by contrast, is a relational database designed for querying and analyzing data rather than for processing transactions. A data warehouse mostly comprises historical data derived from transactional data, although modern data warehouses also hold data from other sources. Its main role is to separate the analysis workload from the transaction workload. It also enables an organization to consolidate data from several sources rather than from transaction data alone.
Like a physical warehouse, a data warehouse environment includes extraction (preparing data from the source systems), transportation (moving the data from the extraction area to the point of use) and transformation (modifying the extracted data for use). It also has a loading solution, an online analytical processing (OLAP) engine, client analysis tools and applications that manage the processes of gathering data and delivering it to business users.
Beyond this general definition, other authors have offered definitions of their own. According to Bill Inmon, “a data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.” Each term in this definition deserves explanation.
Subject-oriented means that a data warehouse can be used to analyze a particular subject area. For example, a warehouse can be built around sales, purchasing or another departmental function within an organization.
Integrated means that a data warehouse incorporates data from different sources. Despite the different sources, the warehouse has a single, consistent way of identifying each item of data.
Time-variant means that a data warehouse holds historical data, which remains available for retrieval at any time in the future.
Non-volatile means that operational updates of data do not take place within the warehouse environment. Once data enters the warehouse it is not changed, so the warehouse does not require transaction processing.
- Text mining refers to discovering and extracting interesting, non-trivial knowledge from free or unstructured text. Natural language processing (NLP), on the other hand, attempts to extract a fuller meaning and representation of the text. In other words, NLP tries to determine the actual meaning of a given text by establishing who did what to whom, where and when. It is a more precise method of analysis because it uses linguistic concepts such as part-of-speech and the grammatical structure of the language used in the text. NLP must deal with anaphora (expressions, such as pronouns, that refer back to something mentioned earlier in the text) and ambiguity (the different meanings that can be derived from the same text).
In relation to text mining, NLP plays a major role in analyzing large collections of text and finding new knowledge and meaningful patterns in them. One advantage of NLP is that it brings knowledge of the language into text mining and into the algorithms applied to the text, giving the user (learner) a deeper understanding of what the text means. This supports pattern discovery over processed texts using state-of-the-art text analysis techniques.
However, users often experience challenges when implementing NLP in text mining. A common challenge is the difficulty of resolving ambiguous sentence delimiters, for example deciding whether a period ends a sentence or belongs to an abbreviation. Another challenge is recognizing regions, such as lists or tables, that are unsuitable for deep analysis.
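The sentence-delimiter problem can be illustrated with a short Python sketch. The sample text, the abbreviation list and both function names are assumptions made for illustration. The second splitter is only a simple heuristic, not a complete NLP tool.

```python
import re

# Hypothetical sample text: the period after "Dr" and the one inside "3.5"
# do not mark the end of a sentence.
TEXT = "Dr. Smith reviewed the report. Sales rose 3.5 percent. The board approved it."

# Illustrative abbreviation list (an assumption, far from complete).
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "inc", "etc"}


def naive_split(text):
    """Treat every period as a sentence delimiter."""
    return [part.strip() for part in text.split(".") if part.strip()]


def heuristic_split(text):
    """Split only at a period followed by whitespace, unless the word
    before the period is a known abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"\.\s+", text):
        words = text[start:match.start()].split()
        last_word = words[-1].lower() if words else ""
        if last_word in ABBREVIATIONS:
            continue  # the period belongs to the abbreviation
        sentences.append(text[start:match.start() + 1].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    print(naive_split(TEXT))
    print(heuristic_split(TEXT))
```

The naive splitter breaks the text into five fragments, while the heuristic version recovers the three intended sentences. Real text contains many more ambiguous cases than an abbreviation list can cover.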
- Information mining is an important activity in any organization. It involves mining (extracting) information from raw data into a form that management can readily use in its decision-making process. Enterprises use different methods to carry out this activity. The main tools and techniques are data mining, text mining and web mining.
Data mining, as related to information mining, refers to the process of analyzing large databases (mostly large stores of data, such as the Internet) with the aim of discovering new information, identifying hidden patterns in the stored data and discovering how the stored data behaves. Data mining is usually an automated process that analyzes huge amounts of data to discover possible future patterns and trends. It analyzes datasets in relational databases from multiple dimensions and angles and produces a summary of the general trends found in the dataset, its relationships and the models that fit it. The main uses of data mining are business intelligence and risk management. Because data mining deals with huge amounts of data, it directly affects the decision-making process.
Text mining, as related to information extraction, deals with data in the form of text rather than structured records. It is defined as an automatic process of discovering hidden patterns and previously unknown information from data in text form. Text data makes up a large share of the data stored on the Internet and the World Wide Web. Text mining is closely related to data mining but differs in its methods of analysis: unlike data mining, it is based on the language used in the text. Natural language processing (NLP) is the main method used in text mining and takes a linguistic approach to extracting information. In this way, text mining brings out hidden traits found in text data. It is used in business applications as well as in scientific research.
Web mining, as related to information gathering, is the automatic crawling and extraction of relevant information from activities and hidden patterns found on the Internet. In business, it is mostly applied to tracking customers’ online behavior by means of cookies and hyperlinks. Web mining works by sending agents to target sites, such as competitors’ sites. Through these agents, web miners learn information from the host web server and gather data useful for analyzing the website. With this knowledge, they can establish better customer relationships and offers and can therefore target potential buyers with better deals.
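As a rough illustration of the hyperlink side of web mining, the following Python sketch collects the links on a page and counts them per host. The HTML sample, the example.com and example.org addresses and the names `LinkCollector` and `links_per_host` are assumptions made for illustration. A real agent would fetch pages over the network and respect each site's crawling rules.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical page content standing in for a crawled web page.
SAMPLE_PAGE = """
<html><body>
  <a href="https://shop.example.com/offers">Offers</a>
  <a href="https://shop.example.com/cart">Cart</a>
  <a href="https://partner.example.org/reviews">Reviews</a>
  <a name="top">Anchor without a link</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def links_per_host(html):
    """Return a count of hyperlinks grouped by the host they point to."""
    collector = LinkCollector()
    collector.feed(html)
    return Counter(urlparse(link).netloc for link in collector.links)


if __name__ == "__main__":
    for host, count in links_per_host(SAMPLE_PAGE).items():
        print(host, count)
```

Counting links per host is only one simple pattern. The same collected links could feed a crawler that visits each target page in turn.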
The three methods of information gathering are similar in that they all extract information from a body of source data, and they all use the Internet as part of their information source. The difference lies in their methods of analysis. Text mining mostly uses a linguistic approach, while the other methods do not depend on the language of the text: data mining analyzes stored records in databases, and web mining analyzes activity and patterns on websites.
- Web 2.0 refers to the second generation of web services available on the World Wide Web. These services play an important role in people’s lives because they let people collaborate and share information online. With Web 2.0, users have a better experience because the services behave more like desktop applications than traditional web services did. Business intelligence (BI), on the other hand, refers to the tools and systems that play an important role in the planning process of a corporation. These tools allow companies to gather, store, access and analyze corporate data that aids the decision-making process. The Web 2.0 revolution is therefore an important development for business intelligence: with the higher performance of desktop computers and single-CPU server systems, data can be analyzed faster through Web 2.0 services. These are some of the underlying relations between Web 2.0 and BI.
Virtual worlds, as used in business intelligence, are computer-based simulated environments. With advancing technology, virtual worlds are materializing and being applied in different areas, such as military simulation and the virtual training environments used by businesses. Because training in organizations is costly, virtual worlds help by reducing these costs. For business intelligence, they make training on data collection from different sites easier, and they also make it easier to track and analyze trainee progress.
Part II Questions
- The term ETL refers to extraction, transformation and loading. As a scheduled data integration process, it includes extracting data from external sources, transforming it into an appropriate format and loading it into a data warehouse. The three steps in the process are:
Extraction. This step involves connecting to the data sources and then selecting and collecting the data required for analysis in the data warehouse. Data from numerous sources is usually consolidated during this step and converted into a format suitable for the transformation process.
Transformation. This is the second step in the ETL process. It involves applying functions or rules to the extracted data in order to convert it into standard formats for storage. It includes validating the data and rejecting data that is not acceptable. The amount of manipulation needed depends on the nature of the data: good data undergoes fewer transformations than poor-quality data.
Loading. This is the last step in the ETL process. Here the data produced by the other two steps (extraction and transformation) is imported into the target warehouse for analysis. There are different load processes. Some insert each record as a new row in a warehouse table using SQL INSERT statements, while others insert data in bulk through a bulk load routine.
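The extraction, transformation and loading steps can be sketched in Python with the standard `sqlite3` module. This is a minimal illustration under stated assumptions: the source records, the `sales` table and all function names are hypothetical, an in-memory SQLite database stands in for the warehouse, and `executemany` stands in for a bulk load routine.

```python
import sqlite3

# Hypothetical records standing in for data pulled from external sources.
SOURCE_ROWS = [
    {"order_id": "1", "amount": " 19.90 ", "region": "north"},
    {"order_id": "2", "amount": "n/a", "region": "South"},
    {"order_id": "3", "amount": "5.00", "region": "EAST"},
]


def extract():
    """Extraction: collect the required records from the source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transformation: convert to standard formats and reject bad records."""
    clean, rejected = [], []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            rejected.append(row)  # validation failed, so reject the record
            continue
        clean.append((int(row["order_id"]), amount, row["region"].strip().lower()))
    return clean, rejected


def load(conn, rows, bulk=True):
    """Loading: insert rows one at a time or in a single batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, amount REAL, region TEXT)"
    )
    sql = "INSERT INTO sales VALUES (?, ?, ?)"
    if bulk:
        conn.executemany(sql, rows)  # batch insert, standing in for a bulk load
    else:
        for row in rows:
            conn.execute(sql, row)  # one SQL INSERT statement per record
    conn.commit()


def run_etl(bulk=True):
    """Run the three steps against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    clean, rejected = transform(extract())
    load(conn, clean, bulk=bulk)
    stored = conn.execute(
        "SELECT order_id, amount, region FROM sales ORDER BY order_id"
    ).fetchall()
    conn.close()
    return stored, rejected


if __name__ == "__main__":
    stored, rejected = run_etl()
    print("loaded:", stored)
    print("rejected:", rejected)
```

Both load modes store the same rows. The difference in practice is throughput, which is why warehouses that receive large volumes of data usually favor bulk loading.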
- A balanced scorecard is a strategic planning and management system used extensively in businesses and industries worldwide. It is an important management system because it helps an organization align its activities with its vision and mission statement. As a business strategy, it also plays a major role in improving both internal and external communications. In addition, a balanced scorecard monitors the performance of the organization, which helps track its success in meeting its goals as well as the basic performance of its employees. A balanced scorecard is therefore a balanced approach that management uses to meet the organization’s goals.