Thursday, April 4, 2019

Data warehousing and data mining

Abstract

This paper discusses data warehousing and data mining, the tools and techniques of each, and the benefits that practising these concepts brings to organisations. It also covers trends and applications of data warehousing and data mining in current business communities.

Keywords: database, data warehouse, data mining, database management.

Introduction

Organisations tend to use information systems to record and retrieve data from daily activities. Information systems, via the databases linked to them, provide valuable data for making sound and strategic decisions regarding the well-being of a company. An organisation can predict what is yet to come from the data it possesses. The data can also be used to provide possible solutions to the problems the organisation faces, and even to gain competitive advantage in its business environment. Databases have reduced, and in some places eliminated, the old method of storing and keeping information, that is, the traditional filing system. The shift towards digitisation of data and the establishment of data repositories has created new terms in the field of information systems, new positions in the organisation, and a new way of doing business and daily transactions in human life.

This paper discusses further two of these terms, data warehouse and data mining, from the perspective of database management in the organisation. At the same time, it covers some issues and challenges concerning data warehouses in the organisation, based on real situations reported in the literature.

According to William H. Inmon, a data warehouse is a set of integrated, subject-oriented databases designed to support Decision Support Systems (DSS) functions, where each unit of data is relevant to some moment in time. A data warehouse is said to contain atomic data and lightly summarised data.

Data mining, on the other hand, is the search for valuable information in large volumes of data (Weiss & Indurkhya, 1998). It is the process of nontrivial extraction of implicit, previously unknown and potentially useful information, such as knowledge rules, constraints, and regularities, from data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques (Technology Forecast, 1997; Piatetsky-Shapiro and Frawley, 1991). As mentioned earlier, many organisations nowadays use computers, especially through information systems, to store the particulars of business transactions such as records of banking operations, retail sales, factory production, telecommunications and other transactions. Data mining tools are then used to uncover useful patterns and associations in the data collected.

Background of data warehousing and data mining

The following part highlights the historical evolution of the database and then discusses data warehousing and data mining. A brief history of both is included, together with the issues faced in the early years of implementing the two concepts and the areas where they are useful.

Data warehousing started in the late 1980s at the IBM lab, and the researchers responsible were Barry Devlin and Paul Murphy. They began by developing the business data warehouse for decision support environments.
In the early 1990s, it became a trend for organisations to meet the growing demand for organising information. However, Haisten (1999), a columnist for the Information Management website, mentioned that the concept of the data warehouse took shape in the early 1970s through a study that started out at MIT with the aim of providing an optimal technical architecture.

Now the next generation of data warehousing, a trend tracked by The Data Warehousing Institute (TDWI), is mushrooming and becoming popular in organisations that treat information as vital capital.

Data mining emerged in the late 1980s and flourished through the 1990s. Its origins can be traced back along three family lines: classical statistics, artificial intelligence, and machine learning. To automate the process of extracting data, which grows all the time, people have increased the power of computers and of data storage. As a result, the amount of data has become huge and more complex. Early methods such as Bayes' theorem (eighteenth century) and regression analysis were first used to identify patterns in data. Data mining builds on these with later discoveries in computer science such as neural networks, clustering, genetic algorithms and decision trees. Data mining can be described as a method that helps with the collection of observations of behaviour.

Ayre (2006) states that today's data mining techniques are the result of work by mathematicians, logicians, and computer scientists who joined together to create artificial intelligence (AI) and machine learning, dating back to the 1950s. That was a very basic spark for the idea of data mining.
In the 1960s, AI and statistics practitioners created new algorithms and methods such as regression analysis, maximum likelihood estimates, neural networks, bias reduction, and linear models.

Also in the 1960s, the field of information retrieval (IR) made its contribution in the form of clustering techniques and similarity measures. At that time these techniques were applied to text documents, but they would later be utilised when mining data in databases and other large, distributed data sets (Dunham, 2003).

In 1997, a report by the Connecticut-based Gartner Group placed data mining and artificial intelligence among the top five major technology areas that would clearly have a major impact across a wide range of industries within the following three to five years. At present, data mining techniques and tools are being extended to a variety of areas. For instance, data mining tools such as intelligent text-mining systems extract the text passages relevant to user queries.

The above outlines how data is transferred to databases and data warehouses and then selected using data mining techniques and technology. It therefore shows how information is formed by transforming data so that it can be deployed in business.

Approaches of data warehousing and data mining in various industries

Finance, sales and marketing, administration and other functions should treat information as a corporate resource, but the many local, narrow systems that held that information simply did not give the integrated corporate viewpoint that was required (Inmon, 2007). Even though operational data is a great asset to the organisation, it seems that data is usually not used to its full potential. A data warehouse therefore basically gives users appropriate access to a consistent and comprehensive view of the organisation, supporting forecasting and decision-making at the managerial level.
Additionally, a data warehouse can achieve information integration by bringing data from dissimilar data sources into a central database. Users from different departments, for instance, can view the data from a single consistent repository. The metadata layer of the data warehouse makes information consistent by enabling data in the warehouse to be searched in business terms rather than in database terminology. The rules that govern how business terms are defined or calculated are also held in the metadata layer and then served to the users. Although the data in the data warehouse is non-volatile, the warehouse must be designed to adapt to changes periodically, because the terminology used in business cannot escape change.

Mannino and Walter (2006), in their study of data warehouse refreshment, stated that data warehouse refreshment is a complex process comprising many tasks, such as extraction, transformation, integration, cleaning, key management, history management, and loading. The study is based on interviews with 13 organisations, and the authors conclude that daily refresh during non-business hours was the most common policy.

Sometimes a data warehouse is not fully utilised by the organisation, or it is used by the company but not by all departments. A case studied by Payton and Zahay (2005) concluded that there are three reasons why the data warehouse frustrated the marketing department: marketing's lack of trust in the data in the corporate data warehouse (CDW), marketing's low perceived quality of the data, and marketing's perception that its needs were not incorporated in the design of the data warehouse and the data warehouse interface.

Information providers such as libraries involved in digital library work also benefit from data mining: they have found methods to classify information automatically and have applied a new way of clustering subjects in the MetaCombine project.
Besides databases, data mining can be useful for a variety of data types such as text, spatial data, temporal data, images, and other complex data.

Data warehousing and data mining in telecommunication

The telecommunication industry is fast becoming the main user of high-volume information systems. The problem the industry faces is that information is generated too fast and in tremendous quantities. Difficulties occur when a user, either a manager or a senior executive, needs access to stored information. If time were not an issue when searching for what they want in that kind of stored data, which is held in different places, there would be no problem at all; but time is limited. For instance, in order to produce a report about subscribers, an executive needs to extract the data, do some analysis, and take some further steps to make it presentable to their superior. What else can ease all this besides technology? The exact question to ask is which technology can be most helpful in this situation. The answer is the application of data warehousing and data mining.

In a real case studied by Papaiacovou, Bramblett, and Burgess (n.d.) in a paper titled "Data warehouse: a telecommunications business solution", the authors describe the difficulty of producing reports. They then designed customised systems that go beyond the traditional boundaries of data warehousing systems by assembling and keeping only important data, analysing and transforming the data, and then summarising and rearranging it according to the demands of the user.

Another interesting article, by Gomez (1998), expresses the hope that cellular companies and other communications firms will strongly consider data warehousing as a way to achieve competitive advantage. The author also reviews new approaches to data warehousing that have proven successful in yielding concrete business benefits.
Service providers realise that, because of competition in the marketplace, they need to provide the best for their clients or risk losing them. Customers can simply change their telecommunication service provider if they are not satisfied with their current one. The provider must therefore gain first-hand knowledge of what customers actually want. After data about customers has been collected through online and phone surveys, a data warehouse can enable executives to analyse customers and segment them into groups by product usage patterns, demographic characteristics, and so on.

Telecommunications companies produce a tremendous quantity of data. These data consist of call detail data, which describes the calls that traverse the telecommunication networks; network data, which describes the state of the hardware and software components in the network; and customer data. Data mining can be used to uncover useful information buried within these data sets.

Telecommunication companies may encounter fraud from customers who intend to use the service without paying for it. This happens when users register and manipulate the registration information. The most common way of identifying fraud is to build a profile of a customer's calling behaviour and compare recent activity against this profile. This data mining application therefore relies on deviation detection. The calling behaviour is captured by summarising the call detail records for a customer.

This is where an issue with data mining arises. In a customer case study, the company ECtel, which sells a data mining product for fraud detection called FraudView, noted that selling data mining products to telecommunication providers has traditionally been difficult because the providers do not have data mining experts on staff who can work with conventional data mining tools. Additionally, there are many ways to evade paying for telecommunication services, from stealing phone cards to bypassing phone circuitry.
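The profile-and-compare approach to fraud detection described above can be illustrated with a minimal sketch. It assumes that a customer's call detail records have already been summarised into daily call minutes, and it flags recent activity using a simple z-score rule. The function names, the sample figures and the threshold of three standard deviations are illustrative assumptions, not part of any product mentioned in this paper.

```python
from statistics import mean, stdev


def build_profile(daily_call_minutes):
    """Summarise historical call activity as a mean and standard deviation."""
    return {
        "mean": mean(daily_call_minutes),
        "stdev": stdev(daily_call_minutes),
    }


def is_deviation(profile, recent_minutes, threshold=3.0):
    """Return True when recent activity lies far from the customer's profile.

    The threshold (in standard deviations) is an assumed tuning parameter.
    """
    if profile["stdev"] == 0:
        # A perfectly constant history: any change counts as a deviation.
        return recent_minutes != profile["mean"]
    z_score = abs(recent_minutes - profile["mean"]) / profile["stdev"]
    return z_score > threshold


# Hypothetical history of daily call minutes for one customer.
history = [30, 35, 28, 32, 31, 29, 33]
profile = build_profile(history)

print(is_deviation(profile, 34))   # a day close to the usual pattern
print(is_deviation(profile, 240))  # a day far outside the usual pattern
```

A real system would profile several attributes at once (destinations, call times, call durations) rather than a single number, but the comparison of recent activity against a stored profile follows the same idea.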
ECtel created FraudView, a solution that uses SPSS Inc.'s data mining workbench and enables the detection of telecommunications fraud in real time.

Data mining in the telecommunication industry is not limited to fraud detection; it can also be used for network fault isolation, marketing or customer profiling, and so on. This is owing to the three main sources of telecommunication data: call detail, network, and customer data.

Data warehouse and data mining in financial services

How can a retail bank truly understand and predict its customers' needs to the point where it can design products and services that suit those needs? One way of looking at customers is from the standpoint of channel usage. In the UK's Lloyds Bank/TSB merger, data were sourced from both banks' data warehouses and then used to segment the customer base by service channel usage. Customers were allocated to segments based on their usage of the following channels: ATMs, automated (direct debits/standing orders), cards (credit and debit) and telephone (Peppard, 2000).

Financial institutions struggle with the large amount of data generated by every transaction.
Data warehouses help financial service organisations analyse large, complex, and rapidly growing data volumes more quickly, for better decision making and faster time to market.

The fundamentals of data mining in finance come from the need to forecast multidimensional time series with a high level of noise, accommodate specific efficiency criteria, make coordinated multiresolution forecasts, and incorporate a stream of text signals as input data for forecasting models (Kovalerchuk & Vityaev, 2002).

As noted by Kovalerchuk and Vityaev, there are four main reasons why data mining needs to be implemented in finance: the emergence of high-volume databases such as commercial data warehouses and computer-automated data recording; advances in computer technology such as faster and bigger computer engines and parallel architectures; fast access to vast amounts of data; and the ability to apply computationally intensive statistical methodology to these data.

Data mining is used to forecast a target variable, for example the relative difference in percent between today's closing price and the price five days later, along with a prediction for the next day.

Data warehouse and data mining in health service

In healthcare there are not as many transactions as in a business environment. The data concerns outpatients, visits to the doctor's office, procedures and so forth. Instead of numerical data, healthcare has textual descriptions of the different medical encounters. This creates a problem: classical data warehouse technology was created to handle transaction-processing data that is dominated by numerical information. When textual, non-transactional information is encountered, classical data warehouse technology is simply at a loss to handle healthcare information (Inmon, 2007). If the data is textual rather than numerical, it must therefore be stored on different terms. It is like a different language.
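To make the difficulty with textual healthcare data concrete, the sketch below maps free-text terms onto one shared vocabulary before they would be loaded into a warehouse. The mapping table is a small hypothetical example built for illustration; a real project would rely on an agreed clinical vocabulary rather than a hand-written dictionary.

```python
# Hypothetical mapping from free-text terms to a shared vocabulary.
STANDARD_TERMS = {
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
    "high blood pressure": "hypertension",
    "htn": "hypertension",
}


def standardise(term):
    """Normalise case and spacing, then look the term up in the shared vocabulary.

    Terms that are not in the mapping are returned in normalised form unchanged.
    """
    key = " ".join(term.lower().split())
    return STANDARD_TERMS.get(key, key)


# The same conditions written in different ways by different sources.
records = ["Heart Attack", "MI", "high  blood pressure", "asthma"]
print([standardise(record) for record in records])
```

Once every source uses the same term for the same condition, the records can be counted and compared in the warehouse in the same way as numerical data.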
For the data to be standardised, a common vocabulary has to be created, for instance, so that everyone shares the same understanding. Then the data can be kept in the data warehouse.

In a case study written by Kumar and Raval (n.d.), the authors describe a large global pharmaceutical company that holds a huge amount of clinical trial data for a number of drug projects. Because data collection and analysis operations are spread across the world, it is harder to implement data standards. Even harder to enforce were the programming and validation standards that are required of pharmaceutical companies. Primarily, the data warehouse serves as an operational middle ground between a large number of disparate, incompatible systems and a versatile collection of end-user platforms.

In another case, Whiting (2001) reported on a healthcare body, Intermountain Health, that used a data warehouse to analyse the treatment provided to its cardiovascular patients over five years. The results helped it improve the service provided after patients return home.

Data mining is also beneficial in healthcare and insurance. It supports claims analysis, which means determining which medical procedures are claimed together. It helps predict which customers will buy new policies, and it can identify behaviour patterns or risky customers and help prevent fraud.

Data warehouse and data mining in retail industry

The challenge in the retail business is actually the flood of data, conflicting data and outdated data. To cope with these challenges, many retailers are building unified repositories of data known as data warehouses.

Since the early implementation of data warehousing technology in the 1990s, the retail business has benefited from practical data warehouses. Starting from the daily historical sales reporting databases created over the past few years, retailers can expand the use of analytical systems to support and produce vital decisions.

The retail industry is going through a transformation.
Data warehouses enable retailers to execute on their core processes, including activities such as inventory replenishment, purchasing, and vendor management across multiple units. They also support financial planning, for example adjusting for stock-outs to seed a top-down financial plan, and provide the data needed for well-organised processes ranging from the confirmation of invoice accuracy to strategy-based solutions.

Simple applications that can implement the concept of data mining for the retail industry are SQL Server 2008 and Microsoft Office Excel 2007. To stay competitive, retailers must not only understand current consumer behaviour but also be able to predict future consumer behaviour. Accurate prediction and an understanding of customer behaviour can help retailers keep customers, improve sales, and extend the relationship with their customers. SQL Server 2008 provides predictive analysis through data mining, and Microsoft Excel 2007 offers data mining capabilities that can help retailers make better decisions.

Common data mining applications in retail business include market basket analysis, fraud detection, database marketing, sales forecasting, and merchandise planning and allocation. Data mining is therefore highly beneficial in the retail industry.

Recommendations

In the business world a transaction is repeated again and again, and many transactions deal with numerical data. The same activity repeats with different customers and different figures. Data warehousing and data mining provide a way out of this mess. Even though data warehousing and data mining are a strategic investment for the business world, they can be risky without a proper understanding of the concepts. Governance or control is important to support the implementation of data warehousing and data mining. There must be a proper standard to ensure compatibility in processing the data, especially for the textual data used in the health industry.
There should also be a policy for managing the data warehouse. To be successful in implementing a data warehouse and/or data mining, an organisation is strongly advised to have broad, comprehensive knowledge about the data in the company. This is to guarantee that a well-integrated data warehouse can be constructed. A well-structured data warehouse will in turn help the organisation exploit, via data mining, the data that it has. The organisation should also know exactly what it wants to implement so that the appropriate data mining tools can be used. Finally, strong support from top management is important for deploying data warehousing and data mining, because the investment is not cheap.

Conclusion

Insufficient data is no longer a problem; the lack of ability to generate valuable information from data is the issue today. The answer to that issue lies in the implementation of data warehouses and the ability to use data mining techniques and tools. Nevertheless, identifying and raising awareness of data warehousing and data mining in the organisation should take many aspects into consideration, regardless of the industry. These aspects include the support of top management, an understanding of the data needed by the organisation, governance and policy, the right design of the data warehouse, and the right tools or techniques for data mining.

Bibliography

Dunham, M. H. (2003). Data mining: introductory and advanced topics. Upper Saddle River, NJ: Pearson Education, Inc.
Kovalerchuk, B., & Vityaev, E. (2002). Data mining in finance: advances in relational and hybrid methods. USA: Kluwer Academic Publishers.
Wang, J. (2003). Data mining: opportunities and challenges. USA: Idea Group Publishing.
Siau, K. (2003). Advanced topics in database research. USA: Idea Group Publishing.
Kumar Sagar, M., & Raval, H. (n.d.). Data warehousing in pharmaceutical and healthcare: an industry perspective. Retrieved January 10, 2010 from http://www2.sas.com/proceedings/sugi24/Dataware/p115-24.pdf
Mannino, V. M., & Walter, Z. (2006). A framework for data warehouse refresh policies. Decision Support Systems, 42, 121-143. Retrieved January 10, 2010 from www.sciencedirect.com
Syncsort Inc. (2010). Business drivers and enabling technologies for clickstream data warehouse initiatives. White paper. Retrieved from www.syncsort.com/clickstream
Balog, K. (2004). An intelligent support system for developing text classifiers. Retrieved January 10, 2010 from http://balog.hu/itm/thesis.pdf
Lee, S. J., & Siau, K. (2001). A review of data mining techniques. Industrial Management & Data Systems, 101(1), 41-46. Retrieved January 10, 2010 from http://www.emerald-library.com/ft
Jayashankar, K. (2007). Data mining tools for analytics application in retail. Information Management Online. Retrieved January 10, 2010 from http://www.information-management.com/white_papers/10000547-1.html
Hackney, D. (1999). A data warehouse is subject-oriented. Are there any rules to go about defining the subjects? Information Management Online. Retrieved January 25, 2010 from http://www.information-management.com/news/1331-1.html
Adelman, S., & Moss, L. (1999). Data warehouse goals and objectives. Part 3: Long term objectives. Information Management Online. Retrieved January 25, 2010 from http://www.information-management.com/issues/19991101/1564-1.html
Bertman, J. (2005). Dispelling myth and creating legends for your e-biz intelligence warehouse. PowerPoint slides. Retrieved from www.dgigusa.com
Lujan-Mora, S., Trujillo, J., & Song, I.-Y. (2006). A UML profile for multidimensional modeling in data warehouse. Data & Knowledge Engineering, 59, 725-769. Retrieved January 25, 2010 from http://www.sciencedirect.com.ezaccess.library.uitm.edu.my/science?_ob=MImg_imagekey=B6TYX-4HWXJXG-1-2R_cdi=5630_user=6533825_pii=S0169023X0500176X_orig=search_coverDate=12%2F31%2F2006_sk=999409996view=cwchp=dGLbVtz-zSkWAmd5=35d7b25297f3ee013bded90b43ecf5bbie=/sdarticle.pdf
Hung, S.-Y., Yen, D. C., & Wang, H.-Y. (2006). Applying data mining to telecom churn management. Expert Systems with Applications, 31, 515-524. Retrieved February 12, 2010 from www.elsevier.com/locate/eswa
Weiss, G. M. (n.d.). Data mining in telecommunications. Retrieved February 12, 2010 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.60.955rep=rep1type=pdf
Lamont, J. (2000). Datawarehousing in the telecommunications industry. KMWorld Magazine. Retrieved February 12, 2010 from http://www.kmworld.com/Articles/Editorial/Feature/Data-warehousing-in-the-telecommunications-industry-9153.aspx
Gomez, J. (1998). Data warehousing for the telecom industry. Information Management Online. Retrieved February 12, 2010 from http://www.information-management.com/issues/19981201/260-1.html
Papaiacovou, D., Bramblett, L. D., & Burgess, J. (n.d.). Data warehouse: a telecommunications business solution. Retrieved February 12, 2010 from http://www2.sas.com/proceedings/sugi22/DATAWARE/PAPER135.PDF
Thompson, B. (2005). Information and communications technology and industrial property. Journal of Property and Investment Finance, 23(6), 506-5015.
Peppard, J. (2000). Customer Relationship Management (CRM) in financial service. European Management Journal, 18(3), 312-327.
Rogers, G., & Joyner, E. (n.d.). Mining your data for health care quality improvement. Retrieved February 12, 2010 from http://www2.sas.com/proceedings/sugi22/DATAWARE/PAPER135.PDF
Silver, M., Su, H.-C., & Dolins, S. B. (n.d.). Case study: how to apply data mining techniques in a healthcare data warehouse. Retrieved February 12, 2010 from http://www.himss.org/content/files/jhim/15-2/him15208.pdf
Bach, M. P., & Cosic, D. (2008). Data mining usage in health care management: literature survey and decision tree application. Med Glas, 5(1), 57-64. Retrieved February 12, 2010 from http://www.ljkzedo.com.ba/M8_10.pdf
Inmon, B. (2007). Data warehousing in a healthcare environment. The Data Administration Newsletter. Retrieved February 12, 2010 from http://www.tdan.com/view-articles/4584
McEachern, C., Stern, L., & Bell, L. (1998). Data warehousing in the health care industry: three perspectives. Information Management Online. Retrieved February 12, 2010 from http://www.information-management.com/issues/19980301/696-1.html
Whiting, R. (2001). Data analysis to health care's rescue. IT helps health-care group identify best clinical practices. InformationWeek. Retrieved February 12, 2010 from http://www.information-management.com/issues/19980301/696-1.html
Haisten, M. (1999). The next stage in data warehouse evolution, part 1. Information Management Online. Retrieved February 12, 2010 from http://www.information-management.com/news/946-1.html
Ayre, L. B. (2006). Data mining for information professionals. Retrieved February 12, 2010 from http://techessence.info/files/Ayre_DataMiningForInformationProfessionals_June2006.pdf
Ross, D. (2005). Retail data warehousing: the state of the art. BeyeNetwork. Retrieved February 12, 2010 from http://www.b-eye-network.com/view/769
Adams, M. (2008). Microsoft SQL Server predictive analytics for the retail industry. Retrieved February 12, 2010 from http://74.125.153.132/search?q=cachekCA9HUfe0VcJdownload.microsoft.com/download/6/9/d/69d1fea7-5b42-437a-b3ba-a4ad13e34ef6/PredAnalyticsRetail.docx+Predictive+Analytics+for+the+Retail+Industry+SQL+Server+Technical+Articlecd=1hl=enct=clnkgl=my
Russom, P. (2009). Next generation data warehouse platforms. Retrieved February 12, 2010 from http://download.101com.com/pub/tdwi/Files/TDWI_BPR_NextGenDWPlatforms_Q409_r.pdf
Payton, F. C., & Zahay, D. (2005). Why doesn't marketing use the corporate data warehouse? The role of trust and quality in acceptance of data-warehousing technology for CRM applications. Journal of Business & Industrial Marketing, 20(4), 237-244. Retrieved February 12, 2010 from www.emeraldinsight.com/0885-8624.htm
