Premium Essay

Data Mining


Submitted By aysenurenc
Words 3471
Pages 14

Use of Data Mining in the field of Library and Information Science : An Overview
Roopesh K Dwivedi
Abstract
Data mining refers to extracting, or “mining”, knowledge from large amounts of data or from a data warehouse. To do this extraction, data mining combines artificial intelligence, statistical analysis, and database management systems in an attempt to pull knowledge from stored data. This paper gives an overview of this emerging technology, which provides a road map to the next generation of libraries. Finally, it explores how data mining can be used effectively and efficiently in the field of library and information science, and its direct and indirect impact on library administration and services.

R P Bajpai

Keywords: Data Mining, Data Warehouse, OLAP, KDD, e-Library

0. Introduction

An area of research that has seen a recent surge in commercial development is data mining, or knowledge discovery in databases (KDD). Knowledge discovery has been defined as “the non-trivial extraction of implicit, previously unknown, and potentially useful information from data” [1]. To do this extraction, data mining combines many different technologies. In addition to artificial intelligence, statistics, and database management systems, these technologies include data warehousing and on-line analytical processing (OLAP); human-computer interaction and data visualization; and machine learning (especially inductive learning techniques), knowledge representation, pattern recognition, and intelligent agents. One may distinguish between data and knowledge by defining data as corresponding to real-world observations, being dynamic and quite detailed, whereas knowledge is less precise, more static, and deals with generalizations or abstractions of the data [2]. A number of terms have been used in place of data mining, including information harvesting, data archaeology,

Similar Documents

Premium Essay

Data Mining

...com/locate/techsoc Data mining techniques for customer relationship management Chris Rygielski a, Jyun-Cheng Wang b, David C. Yen a,∗ a Department of DSC & MIS, Miami University, Oxford, OH, USA b Department of Information Management, National Chung-Cheng University, Taiwan, ROC Abstract Advancements in technology have made relationship marketing a reality in recent years. Technologies such as data warehousing, data mining, and campaign management software have made customer relationship management a new area where firms can gain a competitive advantage. Particularly through data mining (the extraction of hidden predictive information from large databases), organizations can identify valuable customers, predict future behaviors, and enable firms to make proactive, knowledge-driven decisions. The automated, future-oriented analyses made possible by data mining move beyond the analyses of past events typically provided by history-oriented tools such as decision support systems. Data mining tools answer business questions that in the past were too time-consuming to pursue. Yet, it is the answers to these questions that make customer relationship management possible. Various techniques exist among data mining software, each with their own advantages and challenges for different types of applications. A particular dichotomy exists between neural networks and chi-square automated interaction detection (CHAID). While differing approaches abound in the realm of data mining, the...

Words: 8031 - Pages: 33

Free Essay

Data Mining

...Name of the task: Arrange a field trip for members of the local community center to the MFA (Museum of Fine Arts)
Start date: April 01, 2015
Completion date: June 30, 2015
Person assigned to the task:
Completion status: Open
Estimated time: 3 months (until the tour starts)
Steps to be taken:
1. Plan the trip. (April 1st)
2. Look for museums of fine arts (at least 2-3). (April 2nd-3rd)
3. Compare and review them with the community administration and select one. (April 3rd-6th)
4. Contact the authority and get confirmation of when they are available for a big community visit. (April 7th)
5. Fix the date for the trip with the museum. (April 7th)
6. Estimate the budget for the trip. (April 8th-12th)
7. Advertise the trip and encourage the members to participate in it. (April 13th-May 12th)
8. Start the registration process. (April 13th until June 12th)
9. Fee collection starts. (Same time as registration, with 1 extra week, until June 18th)
10. Head count.
11. Arrange transportation: contact bus services and book for the day of the trip. (June 19th)
12. Hire guides who are knowledgeable about fine arts and the museum. (June 20th, but only for the trip day)
13. Arrange a get-together meeting with all the registered members. (June 25th)
14. The day of the trip. (June 30th)
For planning a field trip for local community center members, first we need to choose a Museum of Fine Arts. For that we are going to select 2,3 Museum...

Words: 807 - Pages: 4

Free Essay

Dangers of Strava-Fications

...front tire is quickly going flat, and without a spare tube on me, it’s going to be at least an hour walk back to the car. So, what’s the first thing I do? Do I assess my injuries? Get off the trail so the next rider doesn’t run me over? Nope. I pull out my smashed iPhone, cutting my finger on the broken screen to hit “pause” on Strava. No way will I have my time bloated by this idleness. How can I possibly break into the top 10 on the leader board for this segment that way? And once I make top 10, well, King of the Mountain (KoM) is just around the corner…

My poor Moots

Strava, for those of you who don’t know, is a social fitness application that allows bikers and runners to share, compare, and compete with each other’s personal fitness data. The application lets you track your rides and runs via your iPhone, Android, or GPS device to analyze and quantify your performance and match it against people inside and outside your social circle. I got hooked because Bruce (our director of PM) is hooked. And he’s hooked because Ryan, Tim, and Chris (our developer, QA manager, and lead engineer respectively) are hooked.

My Strava Dashboard

I’ve been riding the trails at Valley Green for years. Previously, I’ve always seen it as a 17-mile loop with a that you can ride clockwise or counterclockwise. I kept track of my time, focusing on how quickly I could complete the entire loop and how well I was handling the technical challenges. But with Strava, I now see the trails as a series...

Words: 1045 - Pages: 5

Premium Essay

Project Manager

...Follow-on Activity: Qualitative Risk Analysis

Purpose: Use this follow-on activity to perform a data quality assessment with data of previous projects.

Instructions for use: To use this tool, gather historical and lessons-learned risk data from past projects. Then review its accuracy and relevance by evaluating it against a set of listed criteria. Does the data fulfill the criteria or not? Provide a reason. Finally, conclude whether your findings give you confidence in the data. Document your answers in the tables provided.

Evaluate how complete the data is

|Completeness |
|Question |Yes/No |Reasons |Conclusions |
|Is the data complete? | | | |
|Are charts, graphics, and tables completely filled in? | | | |

For online use, complete each row as described in the instructions. If you would like to work with the page as hard copy, simply print it out using the Print link at the top of this page.

Evaluate the data's clarity

|Clarity...

Words: 530 - Pages: 3

Free Essay

Itm501

...Derrick Chapman Jr. ITM501- Module I Case November 11, 2013 In review of my position on information overload, there would be no such overload if avenues such as the various social media outlets, informative readings with little or no credibility, and networking forums with no proven success records were not so heavily relied upon within organizations. The course background readings shed light on how social media is hindering the notions of Data, Information, Knowledge, and Wisdom. Data is defined as unprocessed information, while information is data that has had a chance to be processed, and finally knowledge and wisdom are something that can be reflected upon (Green, P. 2010). If you are constructively processing the data that you are receiving, you will become a learning organization, possessing the attributes of knowledge and wisdom. A learning organization will be taught through experience or, simply stated, trial and error. Learning will maximize innovation, effectiveness, and performance, and this knowledge should be spread throughout the organization, creating a very reliable, proven, and stable structure. From a personal perspective, if your organization's structure is designed to support and manage information, there should be no overload. There are endless consequences to information overload, especially when the overload is at the hands of social media technologies. Most of the technologies were designed with the expectations...

Words: 899 - Pages: 4

Free Essay

Differential Manchester Encoding

...transition at the start of the bit if the data is a logic ‘0’. Note: Tanenbaum has a transition for a logic ‘1’ instead.
2. There is always a transition in the middle of the bit.
3. The direction of the transition is immaterial (hence there are two possible waveforms for any data stream, depending upon the initial conditions).

This gives us the following sample test data, assuming pairs of logic levels for one actual bit:

Data                          1  1  0  0  1  0  1  1
Differential Manchester (1)  01 10 10 10 01 01 10 01
Differential Manchester (2)  10 01 01 01 10 10 01 10
After Halsall

2. Design Steps

The output is toggling, which suggests a flip-flop.
If Data = ‘0’, Output = clock or inverse clock
If Data = ‘1’, Output = 2 on –ve clock or inverse 2 on –ve clock

Hence the output must be made up of two AND gates and an OR gate to select either
* clock or inverse clock when Data = ‘0’, or
* 2 on –ve clock or inverse 2 on –ve clock if Data = ‘1’

By De Morgan’s theorem, (A.B)+(C.D) = NOT(NOT(A.B).NOT(C.D)), so we can use three 2-input NAND gates instead of two 2-input AND gates and one 2-input OR gate. Finally we need to flesh out the additional circuitry required.

3. Test Data set

The actual test data needs to be more along the following lines:

Data     0  0  0  0  0  0  0
Output  10 10 10 10 10 10 10
        01 01 01 01 01 01 01

Data     1  1  1  1  1  1  1
Output  01 10 01 10 01 10 01
        10 01 10 01 10 01 10

Data     0  1  0  1  0  1  0
Output  10 01 01 10...
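
The transition rules quoted in this excerpt can be checked with a few lines of Python. This is only a minimal sketch, not the circuit the excerpt goes on to design: it assumes the excerpt's convention (a logic 0 forces a transition at the start of the bit, and every bit has a mid-bit transition), and the function name `differential_manchester` and the parameter `initial_level` are hypothetical names chosen for illustration. Each bit is emitted as a pair of half-bit logic levels, as in the sample test data.

```python
def differential_manchester(bits, initial_level=0):
    """Encode bits as pairs of half-bit logic levels.

    Convention taken from the excerpt: a logic 0 forces a transition at
    the start of the bit, and every bit has a transition in the middle.
    initial_level (an assumed name) is the line level just before the
    first bit; the two possible values give the two possible waveforms.
    """
    level = initial_level
    pairs = []
    for bit in bits:
        if bit == 0:
            level ^= 1          # start-of-bit transition for a logic 0
        first_half = level
        level ^= 1              # mandatory mid-bit transition
        pairs.append(f"{first_half}{level}")
    return pairs


data = [1, 1, 0, 0, 1, 0, 1, 1]
print(" ".join(differential_manchester(data, initial_level=0)))
print(" ".join(differential_manchester(data, initial_level=1)))
```

With the sample data stream 1 1 0 0 1 0 1 1, the two calls reproduce the two rows labelled Differential Manchester (1) and (2) in the excerpt, which differ only in the assumed initial line level.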

Words: 429 - Pages: 2

Premium Essay

Research Map - Cert Perf

...global sense. My objective is to benchmark industries with respect to the data shared between business partners and business-to-business transactions. The Food industry is not known as the leader in Customer – Vendor data sharing, so my research will first seek to define the leading industry and then define what characteristics separate the leaders. The secondary research will result in the following outputs.
• Journal article summaries, with citations, of relevant information
• Book chapters or segments that establish an academic foundation for the B to B interactions, including relevant history and future expectations
• A repository of my findings to share with my cohort
1. Research the current world-class state of the art in customer service.
a. Define the world class quality reporting (WCQR) and service currently available
i. By industry segment
ii. Include depth of disclosure, delivery timing,
iii. Business to Business commitment to achieve WCQR
iv. Define WCQR customer satisfaction and service levels
v. Find, interview and evaluate the best companies, as possible
Phase 2: Primary Data Collection – September to October 2010
Primary research will include Farmland Foods stakeholders and key customers chosen to participate. Research will be conducted using various methods that will be defined and changed to fit the environment as the data collection progresses.
1. Conduct an environmental scan within Farmland...

Words: 1049 - Pages: 5

Free Essay

Student

...Data Loss and Misuse

Question: The service provider shall provide Client Based Data Leakage Services necessary to provide services and support for Data Loss Protection (DLP) with the following activities:
a) Deploy the Clinet endpoint agent (XEA) to all new client machines.
b) Deploy the XEA to at least 95% of existing in-scope client machines within 90 days of its initial release.
c) Deploy any patches or updates to the XEA out to 95% of existing XEA-equipped machines (both clients and servers) within 45 days of those patches or updates being released from testing with approval to deploy.
d) Monitor, investigate and remediate instances where the XEA ceases to function on any machine (client or server) that is still connecting to the XGI.
e) Monitor, initiate investigation, and escalate alerts generated by the DLP system indicating mishandling of Clinet classified data.
f) Distribute reports and data extracts as required.
g) Support Tier I and II help-desk end-users’ and server application support questions arising from the XEA.
Can you meet this requirement? Please explain below.

ORGANIZATION understanding of Requirements: Clinet is looking for Client Based Data Leakage Services necessary to provide services and support for Data Loss Protection (DLP)...

Words: 1129 - Pages: 5

Premium Essay

Integrated Info Management

...management External data and information considerations consist of four external factors: economic, sociological, political, and technological. The economic factor consists of funding sources, contributors, consumers, and competitors. Sociological factors include the local community where the agency functions. Political factors are all the regulatory and accrediting bodies, including the agency's board of directors. The technological domain is about all the areas an agency needs to improve regarding technological advancements. All four domains must be kept in check, and any questions that may come up need to be addressed so that the agency will have the necessary information when it is needed. Internal data and information considerations consist of organizational purpose, organizational planning, organizational operations, human resources, technological resources, and financial resources. The first three domains have to do with the vision of the agency, reviewing the short-term and long-term plans for the agency, and the everyday expectations of the agency, and what data will be needed for the purpose, planning, and operations of the agency. The human resources domain is about what data or information is needed regarding employees of the agency: what data is to be tracked regarding employees' licensing, certifications, trainings, and health information. The technological resources domain is about the agency finding new technology to keep, track, and modify data and information. This domain...

Words: 289 - Pages: 2

Premium Essay

Data

...Discuss the importance of data accuracy. Inaccurate data leads to inaccurate information. What can be some of the consequences of data inaccuracy? What can be done to ensure data accuracy?
Data accuracy is important because inaccurate data may lead to such things as the closing down of a business, the loss of jobs, or the failure of a new product. To ensure that one’s data is accurate, one may double-check the data given to them, as well as have more than one person researching the same data.
Project 3C and 3D Mastering Excel: Project 3G CGS2100L - Section 856 MAN3065 - Section 846
1. (Introductory) Do you think Taco Bell was treated fairly by the mass media when the allegations were made about the meat filling in its tacos? I think so, given that they are serving the public; if you are serving people, then it is the people's right to know exactly what you are serving them.
2. (Advanced) Do you think the law firm would have dropped its suit against Taco Bell if there were real merits to the case? It’s hard to say, but I do think that real merits would have changed the playing field; with real merits, who is to say that Taco Bell wouldn’t have had an upper hand in the case?
3. (Advanced) Do you think many people who saw television and newspaper coverage about Taco Bell's meat filling being questionable will see the news about the lawsuit being withdrawn? I doubt that...

Words: 857 - Pages: 4

Free Essay

School

...Students who struggle with reading in school are not a new problem. This has been a challenge for teachers for years and continues to be an issue in school systems nationwide. As stated in video program five, “While a child’s development may be delayed, the developmental pattern will remain the same” (Bear, 2004). This really lets school officials know that these students are reachable, but the teachers need to provide appropriate instruction for the student’s developmental level. There are several things to be considered, such as grouping, type of instruction, spelling words, and vocabulary.
Teaching special education, it seems that my students are usually grouped by teacher/child ratio. Within those small groups there are a variety of reading levels and adjustments that have to be made. We have reading groups every day in my classroom. My students, along with my teammates' students, are grouped according to ability. Even with that type of grouping, remediation for some students is needed because of their rate of progress. It was stated in video program five that struggling readers need repetition, practice, and explicit instruction (Bear, 2004). I try to provide this through different modalities. One strategy is a computer program called Intelli-talk.
This is a program similar to Cowriter or Write Out Loud. It allows information to be inputted into the system by the teacher and it will orally read the directions, any...

Words: 654 - Pages: 3

Premium Essay

History of Ais

...different business functions, organizations had to develop complex interfaces for the systems to communicate with each other. In ERP, a system such as accounting information system is built as a module integrated into a suite of applications that can include manufacturing, supply chain, human resources. These modules are integrated together and are able to access the same data and execute complex business processes. With the ubiquity of ERP for businesses, the term “accounting information system” has become much less about pure accounting (financial or managerial) and more about tracking processes across all domains of business.

Software architecture of a modern AIS

A modern AIS typically follows a multitier architecture separating the presentation to the user, application processing and data management in distinct layers. The presentation layer manages how the information is displayed to and viewed by functional users of the system (through mobile devices, web browsers or client application). The entire system is backed by a centralized database that stores all of the data. This can include transactional data generated from the core business processes (purchasing, inventory, accounting) or...

Words: 2186 - Pages: 9

Free Essay

Uses of Statistical Data

...delivery of health care. This is particularly true as it relates to the cost of providing health care services (Eaton, 2006). At Mercy Medical Center, not unlike any other health care facility, the use of statistics is pervasive throughout the organization. First and foremost, Mercy uses statistics to develop and maintain its financial imperatives (Minnis, 2008). Simply stated, if the actual cost of providing health care services exceeds the revenue generated, the organization will have difficulty keeping its doors open. This paper will discuss examples of descriptive and inferential statistics in use at Mercy Medical Center. Also discussed will be how data at nominal, ordinal, interval, and ratio levels of measurement are used within the organization. Finally, the advantages of accurate interpretation of statistical data and improved decision making within the organization will be discussed.

Descriptive Statistics

An example of a descriptive statistic used at Mercy Medical Center is time spent by the Emergency Department on yellow alert status. Yellow alert is defined by the Maryland Institute for Emergency Medical Services Systems (2012) as ambulance diversion from a designated emergency department that is unable to effectively manage additional patient volume at that time. Several years ago it became an organizational imperative to minimize and indeed eliminate barriers for patients accessing health care services provided by Mercy Medical Center. Yellow alert...

Words: 917 - Pages: 4

Premium Essay

Statistics in Psychology

...descriptive and inferential statistics, as well as introduce some key terms that are frequently used. It will also describe the functions of statistics and describe how they are applied in the field of psychology. With a better understanding of the various statistical functions and definitions, we will have a better opportunity to provide examples and prove that statistics is more than just colorful charts and graphs. Statistics is where a large amount of data is put together in a format that allows the viewer to understand it better. When choosing an experiment that results in statistics, one would start with a hypothesis, or idea. This gives the entire process a purpose. Statistics serves several functions. When there is a large amount of data, it organizes the data so that a viewer can comprehend it, or a presenter can present it, more easily. One way the data is organized is through charts and graphs, which add clarity. Another function is to show comparisons between two or more sets of data. Statistics helps in forecasting trends and tendencies. Statistical techniques are used for predicting future values and variables. An example of this could be a producer forecasting future production. If he created a set of numbers based on his past experience and compared it to his present demand conditions, he could get a better idea of the future. Similarly, city, state, and federal planners can forecast future increases, or decreases, in population....

Words: 745 - Pages: 3

Premium Essay

Business Decision Making

...London Churchill College
BTEC Higher National Diploma (HND) in Business
Business Decision Making
by Edina Tosoki
Tutor: Rahaman Hasan

LETTER OF TRANSMITTAL

29th of November 2013

Dear Mr. Rahaman Hasan,

Enclosed is a formal report for your attention on the subject of the Kellogg’s case analysis, as requested in September 2013, analyzing the market response to Kellogg’s products in the UK compared with the historical data on the response in India, focusing on the failed launch. This report includes an introduction, literature, methodology, findings and analysis, and finally a conclusion and recommendation section, to make each step of the process clear. All the data that we collected, organized, and analyzed have been presented in charts and graphs for better understanding; a final presentation was then produced to communicate the whole process through visualization. The workload on which this formal report is based was assigned both to small groups and as joint class work; however, this particular report is mainly based on my individual input. Throughout the preparation of this report I have tried to stay objective and to record accurate information to the best of my knowledge. Some sections of this report may reflect my own conclusions, suggestions, and justifications relating to the subject. Thank you for your time in reading and marking my report, and for giving me the opportunity to learn and develop new skills under your guidance.

Yours sincerely,
Edina Tosoki ...

Words: 5712 - Pages: 23