Design & Build a Microsoft Office Access Database

May 12, 2017 | Author: Mark Gregory | Category: N/A



Designing and Building Access Database Systems Edition 7, October 2009

Mark Gregory

Page 1 of 150

Designing and Building Access Database Systems

Mark Gregory
École Supérieure de Commerce de Rennes
ESC Rennes School of Business, France
Previously, School of Computing and Engineering, University of Huddersfield

Edition 0: School of Computing and Engineering, University of Huddersfield, March 2000
Edition 7: ESC Rennes School of Business, France, October 2009

Work on this document started in 1999/2000, when I was working at the University of Huddersfield, UK. Some of the material was originally written by my then colleague Dr. Steve Wade, who is still at Huddersfield. Other portions draw on work by Dr. Ken Lunn, who has moved on to the IT Directorate of the UK National Health Service. The Sixth Edition was a very substantial revision of the previous year’s edition, as is this Seventh Edition.



1. INTRODUCTION: WHO IS THIS DOCUMENT FOR? ..... 13
1.1. Preface ..... 13
1.2. Skills required ..... 13
1.3. The aims of the remainder of this document ..... 13
1.4. The structure of this document and how to use it ..... 14
1.5. About Learning Access ..... 15
1.5.1. Starter: the naïve user: Level 1 ..... 15
1.5.2. The thinking user: Level 2 ..... 15
1.5.3. The competent (“power”) user: Level 3 ..... 15
1.5.4. Advanced: the programmer or systems integrator: Level 4 ..... 15
1.5.5. What is the relevance of MS Access skills? ..... 15
1.6. Testing yourself ..... 15
1.6.1. What you should already know ..... 16
1.6.2. What you should already be able to do or need now to learn ..... 16
1.6.3. A checklist of more advanced skills ..... 17
1.6.4. Further help on learning to learn ..... 17
1.7. Conventions Used in this document ..... 17
1.8. Limitations ..... 17
1.9. Acknowledgements ..... 18

SECTION 1 - THE PRINCIPLES OF DATABASE ..... 19

2. A BRIEF INTRODUCTION TO DATABASES ..... 19
2.1. Databases (bases de données) and how they are designed ..... 19
2.2. Models used by systems analysts ..... 19
2.3. Simple model of data processing ..... 19
2.4. Why study Databases? ..... 20
2.4.1. They are used in every significant BIS ..... 20
2.4.2. They are at the heart of ..... 20
2.5. Background ..... 20
2.6. How to store data ..... 20
2.7. What is a database? ..... 21
2.8. Why keep data in different tables? ..... 21
2.9. Learning a minimum about database ..... 23
2.10. Basic concepts ..... 23
2.11. Entity (type, class) ..... 24
2.12. An example: Students by Programme ..... 25
2.13. Attribute ..... 25
2.14. Primary and Foreign Keys ..... 25
2.15. Why have foreign keys? ..... 26
2.16. Entity occurrences – student ..... 26
2.17. Entity occurrences – programme ..... 26
2.18. Queries ..... 26
2.19. Students and Modules ..... 27
2.20. Three Vital Rules ..... 27
2.21. Resolving Many-to-Many Relationships ..... 27
2.22. Towards a more complete entity relationship attribute model ..... 30
2.23. Example Query – definition ..... 31
2.24. Example Query – results ..... 31
2.25. What is a query? (French ‘une requête’) ..... 32
2.26. Data Dictionary ..... 32
2.27. Entity-Relationship Diagrams (ERD) ..... 32
2.28. Another example database: NorthWind / Les Comptoirs ..... 32

2.29. Relationship: meaning and characteristics ..... 33
2.30. Why it’s important to relate entities ..... 34
2.31. Degree of relationship (simplified) ..... 34
2.32. Simple database design ..... 34

3. A SIMPLE METHODOLOGY FOR DESIGNING MICROSOFT ACCESS DATABASES ..... 35
3.1. Introduction – background to the methodology ..... 35
3.1.1. Database, or something else? ..... 35
3.1.2. What is a methodology? ..... 35
3.1.3. Assumptions ..... 36
3.1.4. Introduction to modelling business information systems: why we have chosen certain techniques ..... 36
3.1.5. What we’re trying to achieve together ..... 37
3.1.6. Business Process Modelling: Documenting a Business Process ..... 37
3.1.7. Why have we chosen the techniques we have? ..... 37
3.1.8. Business Process Modelling ..... 38
3.1.9. SSADM ..... 38
3.1.10. MERISE ..... 39
3.2. Feasibility study ..... 39
3.3. Set out Project Terms of Reference ..... 40
3.4. Analyse the needs of users ..... 40
3.4.1. Identify business processes using a high-level Use Case diagram ..... 40
3.4.2. Identify detailed requirements for a process to be computerised: carry out Process Modelling ..... 40
3.5. Decide the purpose and basic contents of the database – Data Modelling ..... 41
3.5.1. Basic Constructs of ER Modelling ..... 41
3.5.2. Deciding entity types ..... 41
3.5.3. Entities ..... 41
3.5.4. Relationships ..... 42
3.5.5. Fields: What are the attributes of each entity? ..... 43
3.5.6. Data type: Domain ..... 44
3.5.7. Identify Domains ..... 44
3.5.8. Classifying Relationships ..... 45
3.5.9. Keys: primary and secondary (“foreign”) ..... 46
3.5.10. Normalisation ..... 48
3.5.11. ER Notation ..... 49
3.5.12. Online tutorial ..... 50
3.5.13. DFDs and ERDs – why both? How are they linked? ..... 51
3.5.14. Why BOTH Data and Process models? ..... 51
3.6. Cross-check: entity life history ..... 51
3.6.1. Cross-check DFD and ERA ..... 51
3.6.2. Time dimension ..... 51
3.7. Model User System Interactions ..... 52
3.8. Define required outputs: reports, forms, queries ..... 52
3.9. How will Input / Update be carried out (Forms etc.)? ..... 52
3.10. Work through your design on paper, whiteboard, etc. ..... 52
3.11. Implementing processes in Access ..... 52

3.11.1. System data processing ..... 53
3.12. Define (“design” in Access terms) the database: Build a prototype ..... 53
3.13. Refine / iterate / implement ..... 53
3.14. Test the database ..... 53
3.15. Obtain User Feedback ..... 53
3.16. Refine the system by Iteration ..... 54

4. PUTTING DATABASE DESIGN THEORY INTO PRACTICE ..... 54
4.1. Design aids ..... 54
4.2. An Exercise ..... 54
4.3. Achieving real competence in Database Design ..... 55
4.3.1. Documented scenarios ..... 55
4.3.2. Suggested but undocumented scenarios ..... 56
4.3.3. Further study ..... 56

5. MORE ABOUT DATABASES ..... 56
5.1. What is a database? ..... 56
5.2. The history of databases ..... 57
5.3. Implementing data models in MS Access ..... 57
5.4. What is a database management system? ..... 58
5.5. First challenge: database design ..... 58
5.6. Second challenge: database implementation ..... 59
5.7. An inductive approach ..... 59
5.8. What Is a Database? ..... 59
5.9. What Is a DBMS? ..... 60

SECTION 2 – USING MICROSOFT ACCESS TO BUILD GOOD DATABASES ..... 61
6. INTRODUCTION TO MICROSOFT ACCESS ..... 61
6.1. What is a database management system? ..... 61
6.1.1. Software which manages a database ..... 61
6.1.2. Implements entities as tables, maintaining and enforcing relationships ..... 61
6.1.3. Deals with all the component disc files ..... 61
6.1.4. Provides functions such as ..... 61
6.1.5. An approachable programming language ..... 61
6.2. Important facilities of more advanced DBMS ..... 61
6.3. Further facilities of more advanced DBMS ..... 62

6.3.1. Other RDBMS ..... 63
6.4. Why we want business students to learn Access ..... 63
6.4.1. The relative ease-of-use of MS Access ..... 64
6.4.2. MS Access is easily obtained ..... 64
6.4.3. MS Access supports usable programming languages ..... 64

7. MS ACCESS IMPLEMENTATION OF DATA MODELS ..... 64
7.1. Tables, one per entity type ..... 64
7.2. Fields, one per attribute ..... 64
7.3. Records, one per entity occurrence ..... 64
7.4. Attribute types in MS Access ..... 65
7.5. Permitted data types in MS Access ..... 65
7.5.1. Use of Number or Currency fields ..... 67
7.5.2. Storing telephone numbers ..... 67
7.5.3. Controlling data entry formats with masks ..... 67
7.6. Keys ..... 67
7.6.1. Candidate keys ..... 67
7.6.2. Primary key ..... 67
7.6.3. Multi-part primary keys ..... 68
7.6.4. Entity integrity rule ..... 68
7.6.5. Foreign keys ..... 68
7.7. Relationships ..... 68
7.7.1. Relationships and linking: Enforcing referential integrity where appropriate ..... 70
7.8. System outputs ..... 70
7.8.1. Queries ..... 70
7.8.2. Reports ..... 70
7.8.3. Forms ..... 70
7.9. System inputs ..... 71
7.9.1. Forms, sub-forms and their use with 1:M and M:N relationships ..... 71
7.9.2. Field-specific validation checks ..... 71
7.9.3. Using relational integrity to carry out inter-table validation checks ..... 71
7.9.4. Table-level checks on forms ..... 72
7.10. Implementing processes ..... 72
7.10.1. Data processing in Access ..... 72
7.10.2. Functional elements in Access ..... 72
7.11. System data transformations ..... 73
7.11.1. Append and Update queries ..... 73
7.11.2. Macros ..... 73
7.11.3. Visual Basic for Applications (VBA) modules inside Access ..... 73
7.11.4. Visual Basic programs outside Access ..... 73

8. WAYS IN WHICH TO LEARN MORE MS ACCESS ..... 74
8.1. Sample databases and applications included with Microsoft Access ..... 74
8.1.1. NorthWind Traders sample database (English edition) / Les Comptoirs (édition française) ..... 74
8.1.2. Database Wizards (Assistants) ..... 74

SECTION 3 – THE ANYTOWN DISTANCE LEARNING BUSINESS SCHOOL EXAMPLE ..... 75
9. EXAMPLE SCENARIO: ANYTOWN DISTANCE LEARNING BUSINESS SCHOOL ..... 75
10. BACKGROUND: STUDYING ..... 75
11. A CLOSER LOOK INTO "MANAGING STUDENTS" ..... 75
12. THE PROCESS OF DECIDING WHAT HAPPENS TO STUDENTS ..... 76
13. COURSE REVIEW ..... 76
14. SIMPLIFYING ASSUMPTIONS ..... 77
15. EXTERNAL ENTITIES ..... 77
16. PROCESSES ..... 77
16.1. Process Applicants ..... 77
16.2. Admit students to Course – Course Enrolment ..... 77
16.3. Register students on core and optional modules ..... 77
16.4. Teach and assess a module ..... 77
16.5. Prepare for and hold exam board (jury) ..... 77
Collect together the results for all students for all modules they have been studying ..... 77
Review module results in exam board ..... 77
Decide student status in exam board ..... 77
16.6. Review Course ..... 77

17. DOCUMENTS ..... 78
17.1. Course Description ..... 78
17.1.1. List of Modules ..... 78
17.2. Management Reports ..... 78

18. ENTITY AND ATTRIBUTE LISTS ..... 78
19. EXAMPLE STUDENT RECORD REPORT ..... 79
20. ANYTOWN HIGH-LEVEL USE CASE DIAGRAM ..... 80
21. ANYTOWN: CONTEXT DIAGRAM ..... 81

22. LEVEL 1 DFD ..... 82
23. EXAMPLE LEVEL 2 DFD ..... 83
24. DATA DICTIONARY ..... 84
24.1. Data dictionary for Anytown Business School ..... 84

25. ANYTOWN ER DIAGRAM
26. ANYTOWN SYSTEM IMPLEMENTATION
27. TERMINOLOGY ASSOCIATED WITH DATA MODELLING AND DATABASE DESIGN
28. REFERENCES
28.1. Basics of structured analysis
28.2. Database theory
28.3. DataFlow Diagrams (DFDs)
28.4. Entity relationship modelling
28.5. Use Case
28.6. Basics of Object Oriented Analysis and Design (OOAD)

29. APPENDIX 1 BUSINESS PROCESS ANALYSIS USING USE CASE ANALYSIS
29.1. What is a Use Case Diagram?
29.2. What to do if a use case diagram won’t fit on a single page?
29.3. Finding Use Cases
29.4. Naming Use Cases
29.5. Describing Use Cases
29.6. Using Use Cases to identify System Inputs and Outputs
29.7. Other resources for learning about Use Cases

30. APPENDIX 2 DATA FLOW DIAGRAMS
30.1. What are Data Flow Diagrams (DFDs)?
30.2. Why use Data Flow Diagrams?
30.3. What is a DFD? Main elements
30.4. The components of a DFD
30.5. What appears on a DFD?
30.5.1. Listing the elements of a DFD
30.6. The Data Flow Diagram Symbols – SSADM Notation
30.7. Making a Data Flow Diagram: a Top-Down Approach
30.8. The elements of a DFD
30.9. Creating DFDs
30.10. First List the Elements of the Data Flow Diagram
30.11. Drawing the Context Diagram
30.12. Expanding a context diagram to give a level 1 DFD
30.13. Questions to ask yourself
30.14. Rules for DFDs
30.15. Some points on logical DFDs
30.16. Supporting documentation
30.17. Summary: “levelled” DFDs

31. APPENDIX 3 WHEN TO USE A SPREADSHEET, AND WHEN TO USE A DATABASE
31.1. Introduction
31.2. Spreadsheets versus databases
31.2.1. What spreadsheets are good at
31.2.2. What databases are better at
31.2.3. Using spreadsheets and database together
31.2.4. Summary
31.3. What to do if your spreadsheet skills are weak
31.4. What to do if your database skills are weak
31.5. Conclusion
31.6. Acknowledgements – bibliography for Appendix 31

1. APPENDIX 4: REASONS WHY A DATABASE IS TO BE PREFERRED TO A SPREADSHEET - SPREADSHEET DOES NOT EQUAL DATABASE
1.1. More Than a List
1.2. Create the Database
1.3. Create a Data Entry Form

2. APPENDIX 5: ACCESS HINTS - DESIGNING FOR USE
2.1. Getting more help
2.2. Unlocking the power of many-to-many relationships
2.3. Some difficulties associated with forms and subforms and how to overcome them
2.4. Subform not updated
2.5. Detail subform does not show the subset of records based on the value of the current master form record

3. APPENDIX 6: NORMALISATION
3.1. Introduction to Normalisation
3.2. Introduction
3.3. Preliminary remarks
3.4. Terminology
3.4.1. Records
3.4.2. Field names
3.4.3. Keys
3.5. The various stages of normalisation
3.5.1. Convert data into unnormalised form (UNF, 0NF)
3.5.2. Convert UNF into First Normal Form (1NF)
3.5.3. Convert 1NF into Second normal form (2NF)
3.5.4. Convert 2NF into Third normal form (3NF)
3.6. Further normalisation
3.7. A full example of normalisation
3.7.1. Step 1 - Convert data into UNF
3.7.2. Step 2 - Convert data into 1NF
3.7.3. Step 3 - Convert data into 2NF
3.7.4. Step 4 - Convert data into 3NF
3.8. Normalisation: A Summary
3.9. Normalisation complements top-down entity-relationship modelling
3.10. What is achieved by normalisation?
3.11. How is normalisation used in practice?
3.12. Still confused?
3.13. Some questions with which to check your understanding

4. APPENDIX 7 INSTALLING AND USING MICROSOFT VISIO
4.1. Introduction
4.2. Visualize complex information to better understand it
4.3. Learning Visio
4.4. Creating DFDs using Visio
4.5. Installing SSADM support

5. APPENDIX 8 STRUCTURED WALKTHROUGHS, A WAY TO IMPROVE THE QUALITY OF ANALYSIS
5.1. How to seek for perfection! Improving the quality of our work
5.2. References for Appendix 8

1. Introduction: Who is this document for?

1.1. Preface

This booklet aims to help you learn how to design and build applications using Microsoft Access. It is written to be read and understood while you work on your own design-and-build experiments. It is a higher-level self-instruction booklet: it is assumed that you are already a fairly competent Access user. If you need to learn how to use Microsoft Access, please see section 1.6 for further advice.

1.2. Skills required

Modern relational database management systems such as Microsoft Access have been designed to enable users to get as far as is reasonably possible without needing software design and construction (“programming”) skills. Four basic levels of skill can be recognised in database use. These are:

♦ LEVEL 1 – Database User
Straightforward data input, amendment and querying, such as might be undertaken by a clerical or professional worker who is expected to capture and use data as a small part of their everyday work;

♦ LEVEL 2 – Database Builder
Basic database implementation skills, including design of simple databases and implementation of the design as a series of tables, queries and reports; such skills might be anticipated in a professional worker in an office environment who has some responsibility for the basic information systems (IS) needed in that office, but whose primary job responsibility is not IS-oriented;

♦ LEVEL 3 – Database Administrator
Real database design and implementation competence. You would expect this in an information systems professional. But this same higher level of competence may also be found in certain business-oriented individuals who take a real pride in using computers to their full potential. Such individuals are sometimes referred to as power users. The work of such an individual includes serving the needs of other clerical and professional office workers by undertaking detailed analysis, design and implementation work and creating systems usable by other office workers and business professionals.

♦ LEVEL 4 – Database Professional
Expert user with programming skills.

1.3. The aims of the remainder of this document

The main aim of asking you to work through the remainder of this document is to link the following topics:

♦ To help you to learn the principles of modelling information systems in an experiential, problem-oriented way and not just a theoretical one. (Corresponds to Level 1 above)

♦ If you are aspiring to general competence in business studies: to give you reasonable skills in the analysis and construction of effective, albeit small-scale, computerised business information systems. (Corresponds to Level 2 above)

♦ If you really want to exploit the power of databases and systems and / or you aspire to the challenge of managing information systems professionals: to help you reach the point where you can analyse a user's requirements, design them a solution, and refine the solution by means of building a working prototype in MS Access. (Corresponds to Level 3 above)

♦ If you are a budding IS professional, or wish to become a systems analyst or consultant: this document is a starting point only; you will need specific additional training and experience. (Corresponds to Level 4 above)

Note that material which only applies to Level 3 or above is shown in grey-background Arial Narrow, like this paragraph.

1.4. The structure of this document and how to use it

This document consists of:

♦ Section 1 – The Principles of Database
   ∗ Data modelling using ERM
   ∗ Databases

♦ Section 2 – Using Microsoft Access to build good databases
   ∗ System implementation using Microsoft Access

♦ Section 3 – The Anytown Distance Learning Business School example
   ∗ A fully worked example of the analysis and design of a system for a virtual enterprise
   ∗ However, please note that this example does not enter into the business-oriented aspects of the assignment you are doing

♦ Appendix 1 Use Case analysis
   ∗ User interaction modelling using Use Case scenarios

♦ Appendix 2 Data Flow Diagrams
   ∗ Process modelling using DFDs

♦ Appendix 3 When to use a spreadsheet, and when to use a database

♦ Appendix 4 Reasons why a database is to be preferred to a spreadsheet

♦ Appendix 5: Access Hints – Designing for Use

♦ Appendix 6: Normalisation

♦ Appendix 7 Installing and using Microsoft Visio

♦ Appendix 8 Structured Walkthroughs, a way to improve the quality of your analysis

All readers should start with Section 1. Then read the rest of the document, ignoring the text marked as applying only to Level 3 or above. Later, reread the document including that text.

1.5. About Learning Access

You should work your way through the following stages:

1.5.1. Starter: the naïve user: Level 1

You should already be at (or perhaps beyond) this stage. If you aren’t, learn how to use Access now! This Designing and Building Access Database Systems guide cannot help you to learn these basic skills. You are assumed to have them already, but you may have to acquire, revise and practise them at the same time as you are reading this booklet. You’ll find a checklist just below, in section 1.6.

1.5.2. The thinking user: Level 2

This is the business specialist who nevertheless thinks carefully about how s/he can best use a computer to get their work done, or who spots a new application area or ICT-related business opportunity. Working through this document, and using the facilities of each Office program just a little bit more each time, should get you to about this stage.

1.5.3. The competent (“power”) user: Level 3

This is the person who becomes known as the person to whom to talk when no-one else in the department or office seems to know what to do! This is the person who has mastered spreadsheets and uses them frequently, and who knows when to use a database.

1.5.4. Advanced: the programmer or systems integrator: Level 4

Further competence in Access will require you to begin to use the power of the VBA programming language and to understand SQL. This subject is beyond the scope of this module, and is NOT expected in ESC business students.

1.5.5. What is the relevance of MS Access skills?

The main reason for advising business students to learn Access is that it is possible using Access to build reasonably powerful Information Systems (IS) with a tool which is reasonably straightforward (if not always easy!). In effect, you are building a Prototype system using an End User Computing tool. At the same time, you are consolidating what you have been taught in first and second year modules. All this is essential if you are to achieve the learning outcomes of the module that you are studying.

Facility in Access is itself a marketable skill. You should find it much easier to obtain certain internships or placements as a result of the fact that you know industry standard software like Access.

In addition, Access is a reasonably complete implementation of the theoretical relational database model originally defined by Edgar Codd and popularised by many authors (notably Chris Date; see Date 2003). Relational databases are a very powerful way to structure data and to be able to get the information you need as a future manager.

1.6. Testing yourself

In this section, we summarise what we consider to be the basic knowledge and ability you need to have in Microsoft Access. The most important first step is to take a first step! Get hold of a copy of Access and start to use it. As you do so, tick off the various things on the list below. You can start reading this Designing and Building Access Database Systems guide in parallel, but please understand that you cannot understand what is in this book without actually testing your practical ability and knowledge.

1.6.1. What you should already know

We have already revised or introduced the following concepts:

♦ What are Tables?

♦ Designing a Table

♦ Keys: primary and secondary (“foreign”)

♦ Relationships and linking

1.6.2. What you should already be able to do or need now to learn

You should aim at the following practical competences, which you may have acquired in the first year at ESC Rennes, or which you may now need to learn:

Competence                                   Tick when you can do this

Fundamental skills
∗ Starting Access
∗ Creating a database
∗ Creating a Table
∗ Adding Data
∗ Creating a Query
∗ Adding a second table
∗ Linking tables with a relationship – establishing foreign keys

Forms – basic concepts
∗ Creating a form based on a table using the form-building assistant / wizard
∗ Changing the design of the form
∗ Adding records using a form

Reports – basic concepts
∗ Creating a report
∗ Creating a report based on a query

Relationships
∗ Creating a relationship between tables
∗ Creating a query which uses linked tables
∗ Forms, sub-forms and their use with one-to-many (1:M) relationships
∗ Many-to-many relationships and multi-part primary keys

Forms – more advanced use
∗ Using list and combo boxes
∗ Combo boxes (zones de liste déroulantes) and subforms (sous-formulaires)
∗ Creating a subform (sous-formulaire)
∗ Inserting a subform into a main form
∗ Subforms of subforms
∗ Adding record navigation buttons

1.6.3. A checklist of more advanced skills

If you are aiming at Level 3 or Level 4 competence, you will need to achieve:

Competence                                   Tick when you can do this

∗ Competence in update and append queries
∗ Advanced data validation techniques
∗ Basic competence in Visual Basic programming
∗ Basic competence in SQL (structured query language)
∗ Dealing with problematic many-to-many relationships

1.6.4. Further help on learning to learn

See appendix 1 for a very basic introduction to Microsoft Access, and section 31.4 for some suggested websites.

1.7. Conventions Used in this document

Points which are significant only to more advanced users are indicated like this paragraph.

VITAL POINTS are indicated like this!

1.8. Limitations

This document is aimed at people who are comparatively new to systems analysis and design, and who are not aiming to be experts in that field. It therefore aims to be useful and usable without necessarily being totally complete. Where a conflict exists between being totally comprehensive (but unnecessarily difficult), and being comprehensible and straightforward, the second approach is adopted. The aim is to exclude material which is extraneous in the sense that most business people, and indeed many analysts, do not need to consider it. Therefore complex issues such as ternary relationships are ignored. Instead, the main issues are concentrated on and the reader is encouraged to understand them and to apply them. Once the reader is comfortable with the approach adopted in this document and has achieved some real competence in database design and implementation, he/she can read more advanced texts and tackle the more difficult issues. Until then, the slightly simplified (but never facile) approach adopted in this document is a sensible compromise.

Microsoft Access is very unusual as a database management package in that it is intended both to be very useful to people who are new to databases, and also to offer the full power of a programmable system to more advanced users. Microsoft Access aims to make the gap between intermediate and advanced use as small as it can be by providing both macros and Visual Basic for Applications (VBA).

Macros can be used to automate repetitive sequences of commands or instructions, such as those needed to open a form without first having to open the database window explicitly. VBA is a full programming language, and it can be used for relatively complex tasks such as advanced field validation, and also for dealing with anticipated errors and automatically recovering from them.

VBA is based on Microsoft's Visual Basic system. In Office 97 and beyond, the same VBA language is used in all the major Microsoft applications: Word, Excel and Access. Although this document does not assume familiarity with Visual Basic, certain more advanced uses of Access do require awareness of Visual Basic concepts such as functions (sub-programs which return a result), and many more advanced features of Access, such as validation rules, use Visual Basic syntax.

Business students should NOT normally attempt to master the VBA programming language. However, at certain points in this document, VBA is used to illustrate more advanced techniques.

1.9. Acknowledgements

I should like to thank:

♦ Former Huddersfield colleagues Dr. Steve Wade and Dr. Ken Lunn

♦ ESC Rennes colleagues, notably Dr. Renaud Macgilchrist

♦ Previous ESC Rennes students

The following students gave me permission to reuse parts of their excellent work on the Anytown Business School group case. I have incorporated this case as a worked example in this document, and made significant use of these students’ work: Marine CORRE; Marie GALATAUD; Emmanuelle HAMEURY; Naïla MALTI

SECTION 1 - THE PRINCIPLES OF DATABASE

2. A brief introduction to databases

2.1. Databases (bases de données) and how they are designed

In this chapter, we give consideration to databases: what they are, to some extent how they are used, and to a limited extent how they are designed. The subject matter here involves a specialised vocabulary, and a degree of complexity. Many of the ideas surrounding databases are quite strange on first encounter, but they quickly become intuitive if you combine a study of database theory with an attempt to make them work in practice. So: stick with that approach, learn a little then try it out!

2.2. Models used by systems analysts

These are examples only!

♦ Interaction models
   ∗ UCDs (Use Case Diagrams)

♦ Process models
   ∗ DFDs (Data Flow Diagrams)

♦ Data models
   ∗ ERA (Entity Relationship Attribute diagram)

2.3. Simple model of data processing

[Diagram: Source → Data → Data Processing System → Information → Recipient; the Data Processing System is connected to the Database by “Store Data” and “Retrieve Data” flows.]

This diagram, which shows the structure of a data processing system (a synonym for business information system), highlights the central importance of the database as the place where data is stored and from which it is retrieved.

2.4. Why study Databases?

2.4.1. They are used in every significant BIS

♦ Store details of orders, customers etc.

♦ Support product catalogue in B2C applications

2.4.2. They are at the heart of

♦ “Database marketing”

♦ CRM – customer relationship management

♦ ERP – enterprise resource planning

2.5. Background

Some understanding of what a database is, how it is used, and (to a greater or lesser extent) how databases are designed is essential to understanding electronic business. Businesses are systems; they use Information Systems, which are based on Information and Communications Technology. Example: any e-commerce company provides a Web window onto its internal catalogue, in the form of a web page connected to a database. Every stakeholder needs information from the business. They generally obtain this as information presented on forms (screens), reports and dynamic web pages (web pages which show the current contents of a database and permit stakeholders to update that database).

2.6. How to store data

♦ Data is stored in tables: 2-dimensional structures

♦ In MS Office terms:
   ∗ Word tables (also PowerPoint)
   ∗ Excel worksheets
   ∗ Access tables

The 2-dimensional table which follows was created in Word.

Relative strengths and weaknesses of Word, Excel and Access for storing data

Method: Word processing (e.g. Word)

Advantages:
♦ Simple, well understood by people with weak computing skills
♦ Excellent formatting options

Disadvantages:
♦ No formulae (or only very rudimentary ones)
♦ Tables are not related in any way
♦ Can only be updated by one person at a time
♦ The data in a table has no “structure” known to the computer

Method: Spreadsheet (e.g. Excel)

Advantages:
♦ Some degree of structure: cells organised into rows and columns, with links possible between the cells
♦ Very powerful data manipulation using formulae
♦ Separate tables can be held in different worksheets
♦ Items of data can be related together using lookup formulae such as VLOOKUP (RECHERCHEV) and HLOOKUP (RECHERCHEH)

Disadvantages:
♦ Persistent data is not safe
♦ Size limits: 65,536 rows per worksheet (until Office 2007)
♦ No design methodology or coherence: it is possible and easy to mix data up in a way which makes it impossible to find, update and relate
♦ Poor support for queries: searching is slow, and the lookup formulae are far from being intuitive
♦ Can only be updated by one person at a time

Method: Database (e.g. Access)

Advantages:
♦ Each kind of data is stored by the database management system (DBMS) in its own separate table. The tables are related together in accordance with the Relational data model; this gives coherence to the collection of tables, which is the whole database
♦ Very powerful data structuring and querying. In fact a query is just a results table which combines together selected data from more than one stored table. The database program enables users to say what data they need: they construct a query which precisely specifies what data is to be retrieved into the results table
♦ Safer persistent data (though less safe than bigger, more powerful DBMS programs like Microsoft SQL Server, ORACLE etc)
♦ Is multi-user: that is, more than one person at a time can change (update) the database
♦ Since every record in a table has the same basic structure, it is much easier and / or more cost-effective to process complete sets of records under program control

Disadvantages:
♦ More difficult to use and to learn (at first)
♦ Requires thoughtful use and advance planning
♦ Access databases are not directly web-accessible
♦ The programming language within Access, VBA, is too difficult and / or inappropriate for most business users to learn

Figure 1 Comparative strengths and weaknesses of data storage in two dimensional tables: Microsoft Office tools

2.7. What is a database?

♦ A linked collection of tables
♦ Each table containing data about a single kind of thing
♦ Data in the separate tables can be combined ("joined") to answer user needs for information

2.8. Why keep data in different tables?

Company A uses a single table named orders to record orders they receive, while Company B uses a relational database with two tables: orders and customers.



When a customer places an order with Company A, a new record (or row) in the table orders is created.



Because Company A has only one table of data, all the information pertaining to that order must be put into a single record: the customer’s general information, such as name and address, is stored in the same record as the order information, such as product description, quantity, and price. If customers place more than one order, their general information will need to be re-entered and thus duplicated for each order they place.


Order   Customer name        Customer address  Product  Product      Unit of  Price     Quantity  Amount
number                                         code     description  sale     per unit
O001    GREGORY Mark         1 La Rue          P001     Apples       kg       0,80 €    2,5       2,00 €
O002    GREGORY Mark         1 La Rue          P876     Oranges      kg       0,90 €    1         0,90 €
O003    MACGILCHRIST Renaud  1 La Croix        P001     Apples       kg       0,80 €    3         2,40 €
O004    GREGORY Mark         1 La Rue          P001     Apples       kg       0,80 €    2         1,60 €
O005    MACGILCHRIST Renaud  1 La Croix        P876     Oranges      kg       1,05 €    1,5       1,58 €
O006    GREGORY Mark         11 La Rue         P001     Apples       kg       0,90 €    2         1,80 €
O007    GOT Guillaume        1 L’Avenue        P001     Apples       kg       0,90 €    3         2,70 €
O008    GREGORY Mark         99 Le Chemin      P876     Oranges      kg       0,90 €    1,5       1,35 €

Mistyped address: order O006 (11 La Rue instead of 1 La Rue).
Changed address: order O008 (99 Le Chemin).



Whenever there is duplicate data, as in the case above, many inconsistencies may arise when users try to query the database. Additionally, a customer’s change of address might require the database manager to find all records in orders that the customer placed, and change the address data for each one.



Company B is much better off with its relational database. Each of its customers has one and only one record of general information stored in the table customers. Each customer’s record is identified by a unique customer code which will serve as the relational key. When a customer orders from Company B, the record in orders need contain only a reference to the customer’s code, because all of the customer’s general information is already stored in customers.



Indeed, Company B might go further and introduce a product table. It then has:

CUSTOMER table

Customer number  Customer name        Customer address
C001             GREGORY Mark         1 La Rue
C002             MACGILCHRIST Renaud  1 La Croix
C003             GOT Guillaume        1 L'Avenue

PRODUCT table

Product code  Product description  Unit of sale  Standard price per unit
P001          Apples               kg            0,80 €
P876          Oranges              kg            0,90 €

ORDER table

Order number  Customer number  Product code  Actual price per unit  Quantity  Amount
O001          C001             P001          0,80 €                 2,5       2,00 €
O002          C001             P876          0,90 €                 1         0,90 €
O003          C002             P001          0,80 €                 3         2,40 €
O004          C001             P001          0,80 €                 2         1,60 €
O005          C002             P876          1,05 €                 1,5       1,58 €
O006          C001             P001          0,90 €                 2         1,80 €
O007          C003             P001          0,90 €                 3         2,70 €
O008          C001             P876          0,90 €                 1,5       1,35 €

This still isn’t perfect, since Orders and their Details continue to be mixed together in one table (see footnote 1).

2.9. Learning a minimum about databases

Every business student needs to know about and understand:



♦ Database principles
  ∗ Tables
  ∗ Queries
  ∗ A query is a results table
♦ Introduction to database design
♦ How to use a sample database
♦ How databases are used - ERP, CRM etc.

2.10. Basic concepts

♦ Entity: class of thing about which data is stored
  Examples: student; programme – these are tables of data.
♦ Occurrence: a single instance of an entity
  Example: ETU2004987 Smith, John – this is one record in the table of data.
♦ Attribute: a single fact that describes, qualifies or is otherwise a property of an entity
  Example: Programme name, value for John Smith: MA International Business
♦ Key: attribute(s) which uniquely identify a single occurrence of an entity
  Example: student number uniquely identifies a Student
♦ Relationship: a logical connection or dependency between two entities
  Example: any one programme has many students; any one student is on precisely one programme: we say that a one to many relationship exists between programme and student

Note on Company B's PRODUCT and ORDER tables: price per unit is on both tables! One is standard, the other order-specific.

Footnote 1: The solution here includes the introduction of a link or intersection entity, called Order Detail. See section 2.20 for a general description of what must be done.

2.11. Entity (type, class)

An entity represents a class of objects, usually in the real world. Synonyms for entity include class and type.

♦ Entities are of importance to the area of business being investigated
♦ They are objects about which data is stored
♦ Represented as boxes on an Entity Relationship Model (ERM)
♦ Examples:
  ∗ Student
  ∗ Programme

An entity has a number of different data attributes or properties, that is, facts about the thing. For example a student will have a student number, a last name, a first name, and a programme code. A programme will have a programme code, name and programme leader.


2.12. An example: Students by Programme

Entity relationship diagram

Programme
  PK     Programme code
         Programme name
         LMD level
         Leader

Student
  PK     Student no
         Surname
         Forenames
  FK1    Programme code

Sample data

Programme code  Programme name          LMD level  Leader
PGE             Programme Grande École  M          RIVET Philippe
EMBA            Executive MBA           M          MINDAY Don

Student no  Surname   Forenames       Programme code
20099234    Leuchars  Annabelle       EMBA
20099235    Dromsky   Pierre-Charles  PGE
20099897    Mozart    Anne-Marie      PGE

In the diagram, the two rectangular boxes represent entity types. Here, they are programme and student. They are represented as different entity types because they represent different things in the real world. At least in theory, a programme could exist without any students. Almost by definition, a student is on a programme of some kind, but it is clear that programme and student are not the same things. It is equally clear that they are related. The diagram represents this relationship by using a line with a crow's foot at one end of it. The crow's foot marks the many end of a one to many relationship, often represented simply as 1:M.

It is necessary to have an additional attribute on the student which links the student to its owning programme. In the sample data provided with the diagram, we have shown Annabelle Leuchars as being a student on the Executive MBA, by including the Programme code in the Student table. Programme code is a foreign key, which links the Student back to her Programme.

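The two tables and their keys can also be tried out away from Access. What follows is only a minimal sketch, using Python's built-in sqlite3 module as a stand-in for Access; the table and column names are adaptations of the attribute names in the diagram (lower case, underscores instead of spaces), and the in-memory database is an assumption made purely for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Access.
conn = sqlite3.connect(":memory:")

# programme is the One end; student is the Many end and repeats
# programme_code as its foreign key (FK1 in the diagram).
conn.executescript("""
CREATE TABLE programme (
    programme_code TEXT PRIMARY KEY,
    programme_name TEXT NOT NULL,
    lmd_level      TEXT,
    leader         TEXT
);
CREATE TABLE student (
    student_no     TEXT PRIMARY KEY,
    surname        TEXT NOT NULL,
    forenames      TEXT,
    programme_code TEXT NOT NULL REFERENCES programme (programme_code)
);
""")

# Sample data taken from the tables in this section.
conn.executemany("INSERT INTO programme VALUES (?, ?, ?, ?)", [
    ("PGE", "Programme Grande École", "M", "RIVET Philippe"),
    ("EMBA", "Executive MBA", "M", "MINDAY Don"),
])
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    ("20099234", "Leuchars", "Annabelle", "EMBA"),
    ("20099235", "Dromsky", "Pierre-Charles", "PGE"),
    ("20099897", "Mozart", "Anne-Marie", "PGE"),
])

# The foreign key value held on Annabelle's record names her programme.
annabelle_programme = conn.execute(
    "SELECT programme_code FROM student WHERE student_no = ?", ("20099234",)
).fetchone()[0]
student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(annabelle_programme, student_count)
```

Student numbers are declared as text here, in line with the advice later in this document that code numbers should not be stored as numeric data.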
2.13. Attribute

An attribute is a property of an entity, a single fact about the entity. An entity type will normally have several different attributes, one (or occasionally more) of which uniquely identifies every instance of the entity type. The identifying attribute or group of attributes is called the primary key for the entity type.

The attributes of Programme are Programme Code (primary key), Programme Name, and Programme leader.

The attributes of Student are Number (primary key), First name, Last name, and Programme Code (foreign key).

Programme code has to be present as a foreign key in the student entity in order to represent the relationship which exists between programme and student.

2.14. Primary and Foreign Keys

♦ A primary key is an attribute or combination of attributes which uniquely identifies an entity occurrence
♦ To make a link between the Many (child) end of a relationship and its One (parent) end, the Primary key of the One end is repeated in the Many end
♦ In the Many entity, it is known as the Foreign Key
♦ What are Foreign Keys? A foreign key is an attribute that completes a relationship by identifying the parent entity. Foreign keys provide a method for maintaining integrity (coherence, consistency) in the data. Every relationship in the data model must be supported by a foreign key.
♦ Identifying Foreign Keys: Every dependent and subtype entity in the data model must have a foreign key for each relationship in which it participates. Foreign keys are formed in dependent and subtype entities by migrating the entire primary key from the parent entity.

2.15. Why have foreign keys?

♦ One particular programme has many students on it. That group of students (we call the group an entity type or table or set – the words are synonyms) is defined by having the same programme code
♦ The programme code of programme is of course the Primary key
♦ It is also, in the student table, the Foreign key which links each student to the programme of which s/he is a part
♦ Remember that the foreign key repeats, at the many end, the primary key of the one end of the one to many relationship. This is essential if the database management software is to be able to link back together the students on a given programme, or to look up the details of the programme for a given student.

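The integrity that foreign keys provide can be seen in a small experiment. This is a hedged sketch rather than a description of how Access behaves internally: it uses Python's built-in sqlite3 module, where foreign key checking has to be switched on with a PRAGMA, and the student "Nobody" with programme code "XYZ" is a made-up record included only to provoke the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;  -- SQLite only checks foreign keys when asked to
CREATE TABLE programme (
    programme_code TEXT PRIMARY KEY,
    programme_name TEXT NOT NULL
);
CREATE TABLE student (
    student_no     TEXT PRIMARY KEY,
    surname        TEXT NOT NULL,
    programme_code TEXT NOT NULL REFERENCES programme (programme_code)
);
""")
conn.execute("INSERT INTO programme VALUES (?, ?)", ("PGE", "Programme Grande École"))


def add_student(student_no, surname, programme_code):
    """Return True if the record is accepted, False if the database refuses it."""
    try:
        conn.execute(
            "INSERT INTO student VALUES (?, ?, ?)",
            (student_no, surname, programme_code),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when programme_code matches no row in programme.
        return False


accepted = add_student("20099235", "Dromsky", "PGE")  # parent programme exists
rejected = add_student("20099999", "Nobody", "XYZ")   # hypothetical, no such programme
stored = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(accepted, rejected, stored)
```

A student record can therefore never point at a programme that does not exist, which is the kind of coherence the foreign key is there to protect.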
2.16. Entity occurrences – student

Having decided the general attributes of an entity type, it is then possible to store records relating to occurrences of the entity, typically in the real world. So, in the example above, details of three students are given, two on one programme, one on another.

2.17. Entity occurrences – programme

In the example above, details of two programmes are given.

2.18. Queries

The purpose of a database is to enable users to get the specific information they need. This can be done using queries. Queries are useful in themselves, and are also used as the basis for reports and for forms.

♦ To answer a question like "who is programme leader for a given student?", we can get all the necessary information by a query on both tables - programme and student
♦ Note that the name of the programme leader should be an attribute of programme, and definitely NOT of student!

Answering such a question is the work of the relational database management system software (RDBMS). A user of the database formulates a query, and the RDBMS goes away to look up details of occurrences in both entity types, joining the answers together as a result presented to the user.

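The programme leader question can be written down as a query on both tables. The sketch below is illustrative only: it uses Python's built-in sqlite3 module instead of Access, the column names are adaptations of the attribute names used earlier, and the helper function programme_leader is a name invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE programme (
    programme_code TEXT PRIMARY KEY,
    programme_name TEXT NOT NULL,
    leader         TEXT
);
CREATE TABLE student (
    student_no     TEXT PRIMARY KEY,
    surname        TEXT NOT NULL,
    programme_code TEXT NOT NULL REFERENCES programme (programme_code)
);
""")
conn.executemany("INSERT INTO programme VALUES (?, ?, ?)", [
    ("PGE", "Programme Grande École", "RIVET Philippe"),
    ("EMBA", "Executive MBA", "MINDAY Don"),
])
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    ("20099234", "Leuchars", "EMBA"),
    ("20099235", "Dromsky", "PGE"),
    ("20099897", "Mozart", "PGE"),
])


def programme_leader(student_no):
    """Look up the leader of the programme a given student is on."""
    row = conn.execute(
        """
        SELECT programme.leader
        FROM student
        INNER JOIN programme
                ON student.programme_code = programme.programme_code
        WHERE student.student_no = ?
        """,
        (student_no,),
    ).fetchone()
    return row[0] if row else None  # None when the student number is unknown


print(programme_leader("20099234"))  # Annabelle Leuchars is on the EMBA
```

The leader's name is stored once, on the programme record, and reaches the student only through the join on programme code.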
2.19. Students and Modules

(Diagram: the Module and Student entity boxes, linked by a line with a crow's foot at each end.)

In this diagram, the two rectangular boxes represent entity types. Here, they are Module and Student. The relationship is Many-to-Many. The diagram represents this relationship by using a line with a crow's foot at both ends of it. The crow's foot marks the many end of the many to many relationship, often represented simply as M:M or M:N.

This model reflects the empirical observations that:

1. Any one student studies many modules
2. Any one module has many students

Many-to-Many relationships are very common. They are also problematical – this is because actual database management systems like Access (and almost all others) cannot support Many-to-Many relationships directly. However, by following simple rules, it is possible to eliminate many-to-many relationships.

2.20. Three Vital Rules

1. An attribute can only hold a single fact
  ∗ If the name of an attribute is a list (e.g. something in the plural, like Student Qualifications) this is a sign that another entity is needed
2. The Primary key of the One end of a One to Many relationship also appears as a foreign key in the Many end
3. A Many to Many relationship can be resolved into two One to Many relationships, both going to a Link (or Intersection) entity type

These are Rules – there is no need to question them, just to apply them!

2.21. Resolving Many-to-Many Relationships

The many-to-many relationship is removed by:

♦ Introducing a link or intersection entity
♦ Drawing 1-to-many links FROM each original entity TO the new one
  ∗ Note that the primary key of EACH parent entity becomes part of the COMPOUND primary key of the link entity

Here, the primary key of Module is Module Code, and that of Student is Student No. Both become the compound primary key of the Module Registration entity.

Entity relationship diagram

Module
  PK        Module code
            Module name
            Module leader

Module Registration
  PK,FK1    Module code
  PK,FK2    Student no
            Module result

Student
  PK        Student no
            Surname
            Forenames

Sample data (Module Registration)

Module code  Student no  Module result
IS402E       20099234    A
IS402E       20099235    B
IS402E       20099897    C
OB401E       20099234    Fx

♦ Note that there is only one primary key, made up of two attributes
  ∗ Neither Module code nor Student no is unique in the Module Registration table – but the combination is unique

That holds unless a student is allowed to do a module a second time, in which case it is necessary to add a further attribute, usually a date, to the compound primary key in order to make it unique again:

Module code  Student no  Date  Module result
IS402E       20099234    2009  A
IS402E       20099235    2009  B
IS402E       20099897    2009  C
OB401E       20099234    2009  Fx
OB401E       20099234    2010  E

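The link entity and its compound primary key can be sketched as follows. This is an illustration under stated assumptions, not the Access design itself: it uses Python's built-in sqlite3 module, the Module and Student tables are cut down to their keys, the date attribute is called year, and the final duplicate registration is a made-up record used only to show the key doing its job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;
CREATE TABLE module  (module_code TEXT PRIMARY KEY);
CREATE TABLE student (student_no  TEXT PRIMARY KEY);
-- Link entity: two one-to-many relationships meet here.
CREATE TABLE module_registration (
    module_code   TEXT    NOT NULL REFERENCES module (module_code),
    student_no    TEXT    NOT NULL REFERENCES student (student_no),
    year          INTEGER NOT NULL,
    module_result TEXT,
    PRIMARY KEY (module_code, student_no, year)  -- one compound key
);
""")
conn.executemany("INSERT INTO module VALUES (?)", [("IS402E",), ("OB401E",)])
conn.executemany("INSERT INTO student VALUES (?)",
                 [("20099234",), ("20099235",), ("20099897",)])
conn.executemany("INSERT INTO module_registration VALUES (?, ?, ?, ?)", [
    ("IS402E", "20099234", 2009, "A"),
    ("IS402E", "20099235", 2009, "B"),
    ("IS402E", "20099897", 2009, "C"),
    ("OB401E", "20099234", 2009, "Fx"),
    ("OB401E", "20099234", 2010, "E"),   # same student, same module, later year
])

# The many-to-many relationship read from the student side...
modules_for_student = [row[0] for row in conn.execute(
    "SELECT DISTINCT module_code FROM module_registration "
    "WHERE student_no = ? ORDER BY module_code", ("20099234",))]

# ...and from the module side.
students_on_module = [row[0] for row in conn.execute(
    "SELECT DISTINCT student_no FROM module_registration "
    "WHERE module_code = ? ORDER BY student_no", ("IS402E",))]

# Repeating the whole compound key is refused (hypothetical duplicate row).
try:
    conn.execute("INSERT INTO module_registration VALUES (?, ?, ?, ?)",
                 ("OB401E", "20099234", 2010, "A"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(modules_for_student, students_on_module, duplicate_rejected)
```

Without year in the key, the second OB401E registration for student 20099234 could not have been stored at all.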

2.22. Towards a more complete entity relationship attribute model

As analysis proceeds, the model is gradually refined and improved. Still incomplete, it might look like this:

Qualification
  PK        Qualification

Programme
  PK        Programme code
            Programme name
            LMD level
            Programme leader surname
            Programme leader forenames

Student
  PK        Student no
            Student surname
            Student forenames
  FK1       Programme code
            Student gender
            Student birthdate

Award
  PK,FK1    Qualification
  PK,FK2    Student no
            Award result

Module Registration
  PK,FK1    Module code
  PK,FK1    Module year
  PK,FK2    Student no
            Module grade
            Module mark

Module Operation
  PK,FK1    Module code
  PK        Module year
            Module semester
            Module leader

Module
  PK        Module code

Note that this model has introduced a number of changes:

♦ There is greater precision in the attribute names chosen
♦ We wish to record a student’s qualifications, so we have introduced Qualification
♦ Because a many-to-many relationship exists between student and qualification, an intermediate (link) entity has been introduced; we observe that in the real world a specific award is given to each student who qualifies in something, so we’ve called the link entity Award
♦ We observe that many modules are offered and “run” (that is, they occur and are taught) for several years in succession (and sometimes in more than one semester in a year), and further that in some cases students take a module one year, fail it, and do it again in a subsequent year; therefore we introduce a Module Operation, the run of a module in a given year and semester
♦ The model remains incomplete but it’s now good enough to be worth prototyping (building and testing) in Access – so that we can check that it meets our needs for storing data and (above all) retrieving information in a very flexible way

2.23. Example Query – definition

Suppose we want to show a list of the students and the programme they are on, in ascending order of student last name. The slide is a screen shot indicating how this query is constructed in the Microsoft Access RDBMS.

I created it using the query design wizard (assistant) in Access. In the simple query wizard, I specified fields from the student table and from the programme table which I wished to appear in the result. Here, I wanted a list of students with details of the programme they are following. The screen shot shows the resulting query: it indicates that there are two tables which are joined together in the preparation of the result, and it also indicates which fields take part in the result.

2.24. Example Query – results

This slide shows the results of running (executing) the query. The results of the query have the form of a table, in which the columns are the attributes from the participating tables and the rows are the result records.

How has Access created this result? Probably something like this: it reads each record in the student table. One of the attributes of student is the programme code. Programme code is the foreign key in the student table; it is also the primary key in the programme table. Access looks up the details from the programme corresponding to the programme code for each student record. In effect, it joins together the two tables on the basis of the linking foreign key.
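The lookup described above can be imitated in a few lines of plain Python. This is a sketch of the idea only, and makes no claim about what Access actually does internally: the programme table is held as a dictionary keyed by its primary key, the student table as a list of records, and the field names are adaptations of the attribute names used earlier.

```python
# programme table, keyed by its primary key (programme code)
programmes = {
    "PGE": {"programme_name": "Programme Grande École", "leader": "RIVET Philippe"},
    "EMBA": {"programme_name": "Executive MBA", "leader": "MINDAY Don"},
}

# student table: each record carries programme_code as its foreign key
students = [
    {"student_no": "20099234", "surname": "Leuchars",
     "forenames": "Annabelle", "programme_code": "EMBA"},
    {"student_no": "20099235", "surname": "Dromsky",
     "forenames": "Pierre-Charles", "programme_code": "PGE"},
    {"student_no": "20099897", "surname": "Mozart",
     "forenames": "Anne-Marie", "programme_code": "PGE"},
]

result = []
for student in students:
    # Follow the foreign key to the one matching programme record.
    programme = programmes[student["programme_code"]]
    result.append(
        (student["surname"], student["forenames"], programme["programme_name"])
    )

# Ascending order of student last name, as in the example query.
result.sort(key=lambda row: row[0])

for row in result:
    print(row)
```

The result is itself a small table: one row per student, with columns drawn from both of the stored tables.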

2.25. What is a query? (French ‘une requête’)

♦ A query is a response to a question formulated by a user of the database
♦ A query takes the form of a table, in this case, a results table
♦ Technically, data tables are sets or relations
♦ So are the results of a query
♦ The power of a relational database is that it treats all stored data and derived information as what mathematicians call sets (sometimes called relations)
  ∗ Therefore database software is “simply” a computerised implementation of mathematical set manipulation

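The idea that tables and query results are both sets can be made concrete with Python's own set type. The following is a minimal sketch, assuming each record is written as a tuple; the tuple layouts are a convention chosen for this example.

```python
# Each table is a set of tuples; each tuple is one record.
# student: (student_no, surname, programme_code)
student = {
    ("20099234", "Leuchars", "EMBA"),
    ("20099235", "Dromsky", "PGE"),
    ("20099897", "Mozart", "PGE"),
}
# programme: (programme_code, programme_name)
programme = {
    ("PGE", "Programme Grande École"),
    ("EMBA", "Executive MBA"),
}

# Selection: keep only the records that satisfy a condition.
pge_students = {s for s in student if s[2] == "PGE"}

# Projection: keep only some attributes. Duplicates disappear,
# because a set holds each element once.
codes_in_use = {s[2] for s in student}

# Join: pair up records whose programme codes match.
joined = {(s[1], p[1]) for s in student for p in programme if s[2] == p[0]}

print(type(joined).__name__, len(pge_students), sorted(codes_in_use))
```

Every operation takes sets in and gives a set back, which is why the result of one query can be used as the input to another.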
2.26. Data Dictionary

A store of data about data, a dictionary is a database used by analysts, programmers etc. or by you as you design a database. I find it useful to use a spreadsheet for this purpose, and you will find an example dictionary implemented as a spreadsheet for the Anytown case later in this document.

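A data dictionary need not be elaborate. The sketch below shows one possible shape for it, written in Python rather than as a spreadsheet; the column headings, the data types and the maximum lengths are all assumptions made for this illustration and are not taken from the Anytown dictionary.

```python
# One row per attribute, much as it might appear in a spreadsheet.
# Headings, types and lengths are illustrative assumptions.
data_dictionary = [
    {"entity": "Programme", "attribute": "Programme code",
     "type": "text", "max_length": 4, "key": "PK"},
    {"entity": "Programme", "attribute": "Programme name",
     "type": "text", "max_length": 50, "key": ""},
    {"entity": "Student", "attribute": "Student no",
     "type": "text", "max_length": 8, "key": "PK"},
    {"entity": "Student", "attribute": "Surname",
     "type": "text", "max_length": 50, "key": ""},
    {"entity": "Student", "attribute": "Programme code",
     "type": "text", "max_length": 4, "key": "FK"},
]


def definitions_of(attribute):
    """Every dictionary row that defines the named attribute."""
    return [row for row in data_dictionary if row["attribute"] == attribute]


def is_consistent(attribute):
    """True if the attribute has one type and one length wherever it is used."""
    shapes = {(row["type"], row["max_length"]) for row in definitions_of(attribute)}
    return len(shapes) <= 1


print(len(definitions_of("Programme code")), is_consistent("Programme code"))
```

A check like is_consistent is useful for attributes that appear in more than one table, such as a foreign key and the primary key it repeats.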
2.27. Entity-Relationship Diagrams (ERD)

Entity-Relationship Diagrams show:

♦ Entity types = the kinds of things data is collected about in the database
  ∗ Entities = the specific things data is collected about
♦ Relationship = the way specific entities of one type are related to specific entities of the other type
♦ Attributes = the specific data items of interest stored for each entity type

♦ ERDs help determine what kinds of data will be included in a database, and how the database will be structured
♦ They are an excellent communication medium between users and developers

2.28. Another example database: NorthWind / Les Comptoirs

♦ Provided with Microsoft Access – usually to be found under help menu, Sample databases. It includes tables like:
  ∗ Product
  ∗ Order
  ∗ Customer
  ∗ Supplier
  ∗ Purchase order

The screen-shot shows a slightly-improved version of NorthWind.

2.29. Relationship: meaning and characteristics

♦ A link between entities which is significant for this type of system
♦ Degree of relationship
♦ Optionality: does this relationship have to hold?
♦ Name: link phrase, e.g. Customer places orders.
♦ Two names? A relationship can be named in either of two directions, depending on which entity you start from. Thus:
  (i) Customer places orders
  (ii) Orders are placed by customers

2.30. Why it’s important to relate entities

Construction of the query in the previous example was eased because a proper design process had been undertaken in order to determine what entity types would be represented in the database, and how they would be related. This design process resulted in the simple entity relationship model presented in section 2.12. A line was used to link the two entity boxes together; this line had a crow's foot at the many end of the one to many relationship which analysis indicated exists between programme and student.

So one of the most important results of analysis is to establish what the entity types are, and how they are related. Relationships are links between entities which are significant for this type of information and which are normally true in reality. A relationship can have a name: actually, it can have two, one read from one end of the link, the other from the other end. In the earlier example of Student and Programme, we can recognise two relationships - "student is on programme" and "programme enrols student". The relationship is said to have a degree of 1:M, which can be read one to many.

In this case, the relationship is said to be mandatory: that is to say, a student is not a student if they are not on a programme, and a programme is not a programme if it has no students. (The second assertion may not always be the case, and it is possible to represent a relationship as being optional from one or both ends.)

2.31. Degree of relationship (simplified)

The degree of a relationship indicates, for the end being considered, whether more than one occurrence can be associated with one entity occurrence at the other end.

♦ There are three basic possibilities:
  ∗ One to one: 1:1
  ∗ One to many: 1:M
  ∗ Many to many: M:N

For more information concerning the degree of a relationship, please see 3.5.4

2.32. Simple database design

Design means deciding:

♦ What are the main entities?
♦ What are the attributes of each entity?
  ∗ What is the data type of each attribute?
  ∗ Validation rules
♦ Keys: primary and “foreign”
♦ Relationships and linking

Before building, for example, a Microsoft Access database, it is essential to carry out at least some database design. These are the main points that have got to be addressed in even the most informal design exercise. The aim is to identify all the main entities and to give them the appropriate attributes.

Having done that, consideration has to be given to the kind of data which each attribute will actually hold. Broadly speaking, the type of data is either numeric, text or something more special-purpose like a date. If text, consideration has to be given to the maximum number of characters that can be stored. Data which appears to be numeric may not in fact be so: telephone numbers, for example, must be stored as text, as indeed should code numbers like student numbers.

Note the usefulness of a simple data dictionary here. If you decide that a customer number is to be five letters (as it is in NorthWind), then it needs to be five letters everywhere it is used. You record that decision in the data dictionary.

It is absolutely critical to identify the primary key for each entity type, and to ensure that there is a foreign key at the many end of any one to many relationship which is discovered as you think about how the entity types are related.

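Two of the points above, storing telephone numbers as text and keeping a code to its agreed format, can be checked with a short experiment. This is a sketch in Python for illustration only; the telephone number and the customer numbers are made-up values, and the function name is_valid_customer_number is invented for this example.

```python
phone_as_typed = "0123456789"   # hypothetical telephone number

stored_as_number = int(phone_as_typed)   # what a numeric field would keep
stored_as_text = phone_as_typed          # what a text field would keep

print(stored_as_number)   # 123456789  (the leading zero has been lost)
print(stored_as_text)     # 0123456789


def is_valid_customer_number(value):
    """Validation rule: a customer number is exactly five letters."""
    return len(value) == 5 and value.isalpha()


print(is_valid_customer_number("ABCDE"))   # True
print(is_valid_customer_number("ABCD"))    # False: only four characters
print(is_valid_customer_number("12345"))   # False: digits, not letters
```

The same rule would be written once in the data dictionary and then applied as a validation rule on every table that holds a customer number.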
3. A Simple Methodology for Designing Microsoft Access Databases

A methodology is a coherent set of methods, linked by a common underlying philosophy. The methodology I suggest for designing and building Access databases is described in this section, and elaborated on in the remainder of this document. Some of the specific methods are derived from the British SSADM, Structured Systems Analysis and Design Methodology; and one UML technique (see footnote 2), that of Use Case analysis, is employed. However, the Simple Methodology presented here is greatly simplified to make it more appropriate to business use.

3.1. Introduction – background to the methodology

3.1.1. Database, or something else?

As the writer of the biblical book Ecclesiastes wrote nearly three millennia ago, "There is nothing new under the sun"! Be that as it may, it is certainly the case that most standard information handling problems encountered in business are common to more than one business. It is therefore very likely that you will be able to find a system which has been written for someone else but which is (more or less) directly applicable to an information systems requirement you are analysing. If you can find a packaged solution which is (more or less) applicable to your company or to a client whom you are advising, then you can save yourself a lot of effort and the client a lot of money.

But what if there is no such package, or if it really doesn't suit the needs of the client / user? A database may be, and often is, appropriate in many contexts - but still consider alternatives, such as spreadsheets, for more straightforward or small-scale work, or where system users are already highly familiar with spreadsheets. Spreadsheets are great in some contexts, and there is immense power in advanced spreadsheet packages like Microsoft Excel and Lotus 1-2-3 (see footnote 3).

And as we will begin to see as we become more knowledgeable about information systems modelling and more experienced in our use of relational databases, there are deficiencies also in the relational database approach. For this reason, newer techniques such as object oriented systems analysis and design are beginning to be used by IS/IT professionals. But for now, what you must do is work hard to get familiar with systems analysis using the structured approach, and database design using the relational approach.

3.1.2. What is a methodology?

A methodology is a set of methods for tackling a particular class of problem; the methods should be linked by a coherent philosophy and be consistent with one another. Formal (mathematical) and semi-formal (strictly defined) methodologies have been defined for the analysis, design and construction of information systems. However, they are often too rigid, too prescriptive or quite simply too long-winded to be useful for people who are still learning the basics of the craft and who are tackling relatively small problems. The approach adopted in the rest of this document is methodical, but does NOT follow any one specific methodology; instead, it follows a simplified methodology of my own (see footnote 4). So, if you think a full-blown database is appropriate, you need to consider the steps outlined in the main parts of this section, many of them associated with a particular method (see footnote 5).

Footnote 2: UML is the Unified Modelling Language, a set of notations largely used by information systems professionals and particularly associated with a style of programming called Object Oriented or OO. The only UML notation we employ in this module is the Use Case diagram, UCD.

Footnote 3: However, it is a serious error to use a spreadsheet when a database is necessary. Please see appendices 31 and 1 for a discussion of reasons why a database is often superior to a spreadsheet.

3.1.3. Assumptions

The approach described in this document is applicable only to relatively small applications: such as proof-of-concept prototype systems or perhaps end user computing systems. So:

♦ The requirement is relatively small scale
  E.g. the specific needs of the department in which you work; or (part or all) of a small business

♦ A prototype (and perhaps target system) can be implemented using Microsoft Access or a similar end-user orientated database
  Even if it’s too large for Access, people often create an initial (or “prototype”) system in Microsoft Access. This is then used to establish the complete requirements for an eventual full, or “target” system. Or the target system may be sufficiently small to be realisable using Microsoft Access.

♦ You are acting as the Analyst or System Designer
  This document exists to help people design an effective database application. In business, it is normal to distinguish between those who use a system, the so-called users, and those who analyse, design and implement a system – the developers. This document treats you throughout as though you were acting in the developer role. What if you are the user as well as the developer? Then you are in the situation sometimes described as end user development, where a business person or student develops a system for their own use and perhaps also for the use of other members of their team or department. Wherever possible, get someone else – e.g. a member of your team – to act in the role of a true system user. Their perspective may be different but also complementary.

3.1.4. Introduction to modelling business information systems: why we have chosen certain techniques

♦ Business situations and how to model them

Business information systems have several overlapping aspects, and we have a preferred modelling technique for each of these “views”:

Aspect                                                    Preferred modelling technique
The situations and decisions that influence the system    Usage models – Use Case Diagrams
The processes through which information flows through     Process models – Data Flow Diagrams
  system, taking account of the life cycles of
  information
The manner in which data and information is organised     Data models - ERA Diagrams (Entity Relationship Attribute diagram)

Footnote 4: Some of the techniques I use are borrowed from the UK standard SSADM (Structured Systems Analysis and Design Methodology); the French equivalent is MERISE (see, for example, http://www.commentcamarche.net/merise/concintro.php3). Nevertheless, these methodologies in their full form are far too complex to be used by business people unaided.

Footnote 5: Please note that what follows does not directly correspond to the structure of the assignment that you have to undertake, but that there are close parallels.

3.1.5. What we’re trying to achieve together

♦ Three viewpoints - Trois cas de figure
  ∗ You need a system and you want someone else to build it for you
    Then you need to know how to specify what you need
  ∗ You need a system and you have to build it yourself
    Then you need to know how to analyse and build what you need
  ∗ You’re an entrepreneur and you want to build a business
    So: along with your business, marketing, organisational, etc. strategies: you need a systems strategy and process and data models
♦ In every case – YOU have work to do!

3.1.6. Business Process Modelling: Documenting a Business Process

♦ Is itself a business process that involves naming business processes and subdividing them into their basic elements
  ∗ Helps clarify the problem the information system attempts to solve – requirements analysis
  ∗ Can later be used as program specifications
♦ Business Process Modelling is a hot topic associated with quality management
♦ Business Process Reengineering (BPR) = the complete redesign of a business process using ICT

3.1.7. Why have we chosen the techniques we have?

♦ Entity relationship attribute model
  The ER model maps directly to tables and fields in commonly-used database management systems. It is much less complex than classes, the parallel in the more recent (but very technical) object oriented (OO) approach.

♦ Use case model
  This technique is intended specifically for use with business users, and it is reasonably visual. It is therefore a very good basis for a dialogue between you as system users and IS professionals.

♦ Dataflow diagrams
  This technique is intended specifically for use with business users, and it is reasonably visual. It also breaks large problems down into smaller, more manageable ones. It is therefore a very good basis for a dialogue between you as system users and IS professionals.

3.1.8. Business Process Modelling

It has long been recognised that there is a need for a fundamental and high-level analysis of the business's processes. In fact, in early strategic information systems planning exercises, it was not uncommon to seek to do a data flow diagram and an entity relationship model for a complete organisation. However, these tools may be too low-level when all that is required is to identify potential systems requirements.

More recently, specific business process modelling techniques have been developed. The approach which is gaining favour currently is that of the Business Process Management Initiative. See http://www.bpmi.org/ (checked 24/11/2008). This approach, though of great interest, is outside the scope of this document. Instead, the assumption is that a single business process is being “computerised”, that is, supported by a new computer-based information system.

3.1.9. SSADM

SSADM itself is less widely used than once it was but remains important, not least because it is relatively easy for business people to understand when compared with more modern techniques. For a good worked example of all SSADM techniques, please see http://www.systemsanalysis.org.uk/ (accessed 24/11/2008).

Wikipedia (accessed 26/02/2008) has a useful summary of SSADM (Structured Systems Analysis and Design Methodology): http://en.wikipedia.org/wiki/Structured%20Systems%20Analysis%20and%20Design%20Method

The following material was found at http://www.edrawsoft.com/SSADM.php (accessed 03/01/2009).



♦ Introduction - Structured Systems Analysis and Design Methodology (SSADM)

SSADM (Structured Systems Analysis and Design Method) is another method dealing with information systems design. It was developed in the UK by CCTA (Central Computer and Telecommunications Agency) in the early 1980s. It is the UK government's standard method for carrying out the systems analysis and design stages of an information technology project. SSADM has been traditionally used for the development of medium or large systems. However, one variant of SSADM is 'Micro SSADM' which is for small systems. SSADM starts from defining the information system strategy and then develops a feasibility study module. These are followed by requirements analysis, requirements specification, logical system specification and a final physical system design.



Structured Systems Analysis and Design Methodology (SSADM) Stages

SSADM consists of 5 main stages (which are broken down into several sub-stages). The 5 main stages are:



Feasibility Study The Feasibility Study involves a high-level analysis of a business area to determine whether it is feasible to develop a particular system. Data Flow Modelling and (high-level) Logical Data Modelling can be used as techniques during this stage.



Requirements Analysis In the Requirements Analysis stage, requirements are identified, the current business environment is modelled, and business system options are produced and presented. One of these options will be chosen and then refined. Data Flow Modelling and Logical Data Modelling can be used as techniques during this stage.



Requirements Specification In the Requirements Specification stage, the functional and non-functional requirements are specified as a result of the previous stage. Data Flow Modelling, Logical Data Modelling and Entity Event Modelling can be used as techniques during this stage.



Logical System Specification In the Logical System Specification the development and implementation environment are specified, and the logical design of update and enquiry processing and system dialogues are carried out.



Physical Design During the Physical Design stage, the logical system specification and technical specification are used to create a physical design and a set of program specifications.



Applicability of SSADM

Unlike rapid application development, which conducts steps in parallel, SSADM builds each step on the work that was prescribed in the previous step with no deviation from the model. Because of the rigid structure of the methodology, SSADM is praised for its control over projects and its ability to develop better quality systems. Most current developers find it too onerous in its application, however.

3.1.10.

MERISE

This is a French equivalent to SSADM. See, for example, http://www.commentcamarche.net/merise/concintro.php3 accessed 24/11/2008.

3.2.

Feasibility study

Research the basic situation and write up a natural-language description of the scenario. Go on to describe the purpose of the database, who will use it; why; and for storing what kind of data



Include a list of the basic, obvious system requirements (things the system must do) as enunciated by its users or their representatives



Set out the basic constraints – budget, timescale, existing systems, etc.



Can the user afford the hardware, software, development effort and training required to implement the system?



Do the benefits (financial and other) exceed the costs?



Is there sufficient time to build a system in this way, or will the user have to make do with a bought-in package?

The result of a feasibility study is usually a Go / No-Go decision, with the go sometimes given to a proposal whose scope is more limited than the original idea.

3.3.

Set out Project Terms of Reference The terms of reference for a project are an agreement between the client for a project and the people responsible for carrying out the project. The terms of reference describe the context and scope of the project, identify the client, implementers and project management approach, and set out the overall timetable for the project.

3.4.

Analyse the needs of users The next step is a thorough analysis of user requirements. You might use Use Case diagrams (see also Appendix 29) and Data Flow Diagrams (see also Appendix 30) for this purpose:

3.4.1. Identify business processes using a high-level Use Case diagram This should be done for a whole area of business, not just the part which you intend to computerise. The purpose of this step is to ensure that you have a good idea of what is, and is not, in the scope of a particular process to be computerised. This can be done using a high-level Use Case diagram; the technique is described in Appendix 29.

3.4.2. Identify detailed requirements for a process to be computerised: carry out Process Modelling This is done using Data Flow Diagrams, which can be used in an iterative, refining fashion both to understand the needs of users better, and to improve on the existing situation with models of a better, computerised, solution. The elements of a Data Flow Diagram (DFD) are:



External entities These are people and/or systems outside the scope of the system being modelled. They may also indicate required entities in the subsequent data model.



Processes These are business activities within the scope of the system being modelled, which process things and data in order to carry out an activity of value to the business.



Data stores These are places where data is held within the system. They also indicate required entities in the subsequent Data Model.



Flows DFDs are intended to identify flows of data.

For a tutorial on DFDs, see http://www.cems.uwe.ac.uk/~tdrewry/dfds.htm checked 24/11/2008. Appendix 30 describes the technique.

3.5.

Decide the purpose and basic contents of the database – Data Modelling The approach adopted here is to develop an Entity Relationship (Attribute) [ER; ERA] model (after the method first proposed by Peter Chen, 1976). What data has to be stored? And how is it related?

3.5.1. Basic Constructs of ER Modelling The ER model views the real world as a construct of entities and associations between entities.

3.5.2. Deciding entity types

An entity type is a class of real-world thing of which there is (usually) more than one occurrence



Its name is a noun or noun-phrase



As an initial indication: in a natural language description of the scenario, underline the names of things which occur more than once in the real world



Look for “persistence” - longevity, ongoing significance



Do not include reports - this is output information, we are looking for the structure of the underlying data Invoices may be stored as an entity type. However, in my view, an Invoice is a report which is created at a specific moment in order to seek payment from a customer. It can therefore be argued that there is no need to store its details in the database.



Do not include "calculated" items as attributes For example, it is usually wrong to store calculated per-ordered-item costs in a specific attribute such as amount, when these can quickly be calculated as (quantity * unit price) at the time of use of a query or report.



“Rationalise” (merge, remove) irrelevant entities



It isn’t always clear when something is an attribute of something else, or an entity type in its own right; however, an attribute should never be a list – if it is, this indicates another entity
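The advice above about not storing calculated items can be sketched as follows. This is a minimal illustration only, using Python's built-in sqlite3 module rather than Access; the table name OrderedItem, its columns and the sample rows are all hypothetical. The amount is worked out as quantity * unit price when the query runs, and no amount column exists in the table.

```python
import sqlite3

# Hypothetical table, column names and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderedItem ("
    " order_no INTEGER, product_code TEXT,"
    " quantity INTEGER, unit_price REAL,"
    " PRIMARY KEY (order_no, product_code))"
)
conn.executemany(
    "INSERT INTO OrderedItem VALUES (?, ?, ?, ?)",
    [(1, "A10", 3, 2.50), (1, "B20", 1, 10.00)],
)

# The amount is derived at the time the query is used; it is never stored.
rows = conn.execute(
    "SELECT product_code, quantity * unit_price AS amount"
    " FROM OrderedItem ORDER BY product_code"
).fetchall()
print(rows)
```

In Access the same idea would normally be expressed as a calculated field in a query or report, so that a change to a quantity or price can never leave a stale stored amount behind.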

3.5.3. Entities Entities are the principal data objects about which information is to be collected. (This material was in part found at http://www.edrawsoft.com/datamodel.php checked 18/10/2009.) Entities are usually recognizable concepts, either concrete or abstract, such as persons, places, things, or events which have relevance to the database. Some specific examples of entities are Employees and Projects. An entity is analogous to a table in the relational model. Entities can be classified as independent or dependent (in some methodologies, the terms used are strong and weak, respectively). An independent entity is one that does not rely on another for identification. A dependent entity is one that relies on another for identification. An entity occurrence (also called an instance) is an individual occurrence of an entity. An occurrence is analogous to a row in the relational model.



Special Entity Types

Associative entities (also known as link or intersection entities) are entities used to associate two or more entities in order to reconcile a many-to-many relationship.



Subtype entities are used in generalisation hierarchies to represent a subset of instances of their parent entity, called the supertype, but which have attributes or relationships that apply only to the subset. An example is B2B Customer, a specialisation of Customer. The Customer entity has the main attributes. The B2B Customer entity then has additional attributes specific to B2B, for example, credit arrangements or contact details. Customer and B2B Customer have a one-to-one relationship.

Associative entities and generalisation hierarchies are discussed in more detail below.



What are the main entities / tables? We now go on to decide which tables are necessary and how they link together. There should be a table for each class of real-world thing, or 'entity'.

3.5.4. Relationships A Relationship represents an association between two or more entities. Examples of such relationships might be: (1) Employees are assigned to projects; (2) Projects have subtasks; (3) Departments manage one or more projects. Relationships are classified in terms of degree, connectivity, cardinality, and existence. These concepts are discussed below.



Relationships and linking How are the entity types inter-related? There are three basic possibilities, sometimes referred to as the cardinality of the relationship. Cardinality specifies how many instances of an entity relate to one instance of another entity. Ordinality is also closely linked to cardinality. While cardinality specifies the number of occurrences of a relationship, ordinality describes the relationship as either mandatory or optional. In other words, cardinality specifies the maximum number of related records and ordinality specifies the absolute minimum number of related records. When the minimum number is zero, the relationship is usually called optional and when the minimum number is one or more, the relationship is usually called mandatory.



1:1 (one to one) In a one-to-one relationship, each record in Table A can have only one matching record in Table B, and each record in Table B can have only one matching record in Table A. This type of relationship is not common, because most information related in this way would be in one table. For example, it may not be necessary to have a separate credit reference entity; instead, its attributes could appear on the customer entity. You might use a one-to-one relationship to divide a table with many fields, to isolate part of a table for security reasons, or to store information that applies only to a subset of the main table. For example, you might want to create a table to track employees participating in a fundraising soccer game. The additional attributes for employees who are also football players would be stored in a football player table, linked one-to-one to employee. This is done because the vast majority of employees will not be football players. Similarly, you might have a general customer table, and then link it to a B2B table (for B2B-specific elements) and a B2C one. See also generalisation hierarchies below.



1:M (one to many) A one-to-many relationship is the most common type of relationship. In a one-to-many relationship, a record in Table A can have many matching records in Table B, but a record in Table B has only one matching record in Table A.



M:N (many to many) and their resolution into two 1:M relationships to a new link entity In a many-to-many relationship, a record in Table A can have many matching records in Table B, and a record in Table B can have many matching records in Table A. This type of relationship can only be stored in a database by defining a third table (called a junction table, or a link or intersection entity) whose primary key consists of or includes two fields: the foreign keys from both Tables A and B. A many-to-many relationship is really two one-to-many relationships with a third table. For example, an Orders table and a Products table have a many-to-many relationship that's defined by creating two one-to-many relationships to an Order Details table. It is occasionally necessary to add another attribute to the key to ensure uniqueness – often this is a date/time field.
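The Orders / Products / Order Details example above can be sketched as follows. This is an illustration under stated assumptions, not an Access implementation: it uses Python's built-in sqlite3 module, and the column names and sample rows are hypothetical. The junction table's primary key is the combination of the two foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE Orders   (order_id   INTEGER PRIMARY KEY, order_date   TEXT);
CREATE TABLE Products (product_id INTEGER PRIMARY KEY, product_name TEXT);

-- Junction (link) table: one row per product appearing on an order.
CREATE TABLE OrderDetails (
    order_id   INTEGER NOT NULL REFERENCES Orders(order_id),
    product_id INTEGER NOT NULL REFERENCES Products(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);

-- Hypothetical sample data.
INSERT INTO Orders VALUES (1, '2009-10-01'), (2, '2009-10-02');
INSERT INTO Products VALUES (10, 'Widget'), (20, 'Gadget');
INSERT INTO OrderDetails VALUES (1, 10, 5), (1, 20, 1), (2, 10, 2);
""")

# One order has many products ...
products_on_order_1 = [r[0] for r in conn.execute(
    "SELECT p.product_name FROM OrderDetails d"
    " JOIN Products p ON p.product_id = d.product_id"
    " WHERE d.order_id = 1 ORDER BY p.product_name")]

# ... and one product appears on many orders.
orders_with_widget = [r[0] for r in conn.execute(
    "SELECT order_id FROM OrderDetails"
    " WHERE product_id = 10 ORDER BY order_id")]

print(products_on_order_1, orders_with_widget)
```

Each side of the many-to-many relationship is answered by an ordinary one-to-many join through OrderDetails, which is why the link entity is sufficient.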

3.5.5. Fields: What are the attributes of each entity? Attributes describe the entity with which they are associated. A particular instance of an attribute is a value. For example, "DOLMAN Arthur" is one value of the attribute Name. The domain of an attribute is the collection of all possible values an attribute can have. The domain of Name is a character string. Attributes can be classified as identifiers or descriptors. Identifiers, more commonly called keys, uniquely identify an instance of an entity. A descriptor describes a nonunique characteristic of an entity instance. What attributes / fields are required? In other words: what characteristics or attributes does each table have? For example, an animal type entity has a primary key of type, and other attributes, such as the number of legs this kind of animal has, and its normal diet.

Each of the characteristics represents a different field in the table and to differentiate them they need a unique name. A database management system such as Access needs to be told the name of each field (attribute) and the type of data (text, numeric, date etc.) which that field represents. If it is a text field, the maximum number of characters (for example, the length of the longest name to be stored) must also be specified.

AVOID “repeating fields”- fields whose name is in the plural, or which imply a plural, which almost certainly requires a list of values - this is almost invariably a sign that an extra table is needed. As an example in a students database, do not make qualifications into fields of Student – a student already has many qualifications and will gain more. The very fact that the word qualifications is in the plural is an indication that the relationship between student and qualification is in fact many-to-many. So a good database design is:

Programme
  PK      Programme code
          Programme name
          LMD level
          Programme leader surname
          Programme leader forenames

Student
  PK      Student no
          Student surname
          Student forenames
  FK1     Programme code
          Student gender
          Student birthdate

Qualification
  PK      Qualification

Award
  PK,FK1  Qualification
  PK,FK2  Student no
          Award result

Note that, in accordance with the rule that the primary key of the one end of a one-to-many relationship becomes an attribute of the many end – where it is known as a foreign key – the entity type Award has attributes Qualification and Student no. Very frequently, the combination of the foreign keys is the best primary key for the new entity type. However, it is sometimes necessary to add a date or time attribute to make the key unique – this is arguably necessary here because it is possible to envisage a student achieving a qualification on more than one date. However, for simplicity, we have ignored this rare possibility here.

3.5.6. Data type: Domain For each field: it is necessary to consider the type of data it will contain. Numeric? Integer or real? etc. If text: how many characters? The combination of type, size and all its potential values is sometimes called the domain of an attribute.

3.5.7. Identify Domains This concept, which goes beyond the Chen model, is both well-founded theoretically and very useful in practice. A Domain is the list of all possible values of an attribute. Thus you might know that the set of all possible values of a Sex attribute is Male and Female (for mammals); you might also choose to add the value Hermaphrodite (to cover worms). It is also very common to permit a Null value, meaning that for a particular individual we do not know what their sex is. However, with these four permitted values, we have defined ALL possible values of that attribute. From this we can state that a Sex attribute should be a 1-character Text field, and that a Validation rule should permit only the values M, F, H and (perhaps) space, representing unknown. We have identified the domain of the Sex attribute. It is important to think about the Domain of an attribute for two reasons:



The Domain determines the data type, size and permitted values All attributes having the same Domain should have the same data type, size and permitted values. Therefore a Surname should be defined in the same way throughout a database implementation. Neither MS Access nor the vast majority of actual database management systems provide direct support for the Domain concept - instead, it is the responsibility of the implementer to ensure that all attributes which share the same domain are defined with the same type (e.g. numeric integer, text ….), size (e.g. long, double, 5-character text ….), and that appropriate validation rules are defined and enforced. In the case of the animal type entity, the number of legs attribute is an integer number in the range 2 to 1000. The data type is integer; the domain is the total set of possible values, in this case, 2, 4, 6, 8… 1000 (millipede!).



Validation rules: What rules apply to each field having this domain? Simple example: Sex may be male or female. All other values should be disallowed by a validation rule, which permits only M (male/masculine) or F (female/feminine) (and, perhaps, unknown) as values for a Sex attribute. Consider the validation rules for each data attribute. For example, in the animal entity, the attribute number of legs must be a value in the domain of all possible values. Values such as 3 and 5 are never valid. Consider setting a rule which does not permit these values. This has the benefit that it decreases the likelihood of storing bad data.
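The two domains discussed above (Sex, and number of legs) can be sketched as validation rules in the following way. This is a minimal illustration using Python's built-in sqlite3 module, where the rule is written as a CHECK constraint; in Access the equivalent would be entered as the field's validation rule. The table layouts and sample values are hypothetical, and a missing (Null) Sex is taken here to mean unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Number of legs: an even integer in the range 2 to 1000.
conn.execute(
    "CREATE TABLE AnimalType ("
    " type TEXT PRIMARY KEY,"
    " number_of_legs INTEGER NOT NULL"
    "   CHECK (number_of_legs >= 2 AND number_of_legs <= 1000"
    "          AND number_of_legs % 2 = 0))"
)

# Sex: M, F or H; a Null is accepted and stands for unknown.
conn.execute(
    "CREATE TABLE Patient ("
    " patient_code INTEGER PRIMARY KEY,"
    " sex TEXT CHECK (sex IN ('M', 'F', 'H')))"
)

def try_insert(sql, params):
    """Return True if the row is accepted, False if a rule rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("INSERT INTO AnimalType VALUES (?, ?)", ("Dog", 4)),     # valid
    try_insert("INSERT INTO AnimalType VALUES (?, ?)", ("Oddity", 3)),  # odd number of legs
    try_insert("INSERT INTO Patient VALUES (?, ?)", (1, "F")),          # valid
    try_insert("INSERT INTO Patient VALUES (?, ?)", (2, "X")),          # not in the domain
    try_insert("INSERT INTO Patient VALUES (?, ?)", (3, None)),         # unknown
]
print(results)
```

Because the rule lives in the table definition, every form or query that writes to the field is subject to it, which is the point of defining the domain once.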

3.5.8. Classifying Relationships Relationships are classified by their degree, connectivity, cardinality, direction, type, and existence. Not all modelling methodologies use all these classifications.



Degree of a Relationship The degree of a relationship is the number of entities associated with the relationship. The degree is usually two; such relationships are called binary relationships. The n-ary relationship is the general form for degree n. Special cases are the binary and ternary, where the degree is 2 and 3, respectively.



Binary relationships

This association between two entities is the most common type in the real world.



A recursive binary relationship occurs when an entity is related to itself. An example might be "some employees are married to other employees".

Ternary relationships A ternary relationship involves three entities and is used when a binary relationship is inadequate. Many modelling approaches recognize only binary relationships. Ternary or n-ary relationships are decomposed into two or more binary relationships. They are sufficiently rare to be ignored in the remainder of this document.



Direction The direction of a relationship indicates the originating entity of a binary relationship. The entity from which a relationship originates is the parent entity; the entity where the relationship terminates is the child entity. The direction of a relationship is determined by its connectivity. In a one-to-one relationship the direction is from the independent entity to a dependent entity. If both entities are independent, the direction is arbitrary. With one-to-many relationships, the entity occurring once is the parent. The direction of many-to-many relationships is arbitrary.



Type of relationship An identifying relationship is one in which the child entity is also a dependent entity. A non-identifying relationship is one in which both entities are independent.



Existence Existence denotes whether the existence of an entity instance is dependent upon the existence of another, related, entity instance. The existence of an entity in a relationship is defined as either mandatory or optional. If an instance of an entity must always occur for an entity to be included in a relationship, then it is mandatory. An example of mandatory existence is the statement "every project must be managed by a single department". If the instance of the entity is not required, it is optional. An example of optional existence is the statement, "employees may be assigned to work on projects".



Generalisation Hierarchies A generalisation hierarchy is a form of abstraction that specifies that two or more entities that share common attributes can be generalised into a higher-level entity type called a supertype or generic entity. The lower-level entities become the subtypes, or categories, of the supertype. Subtypes are dependent entities.
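A generalisation hierarchy such as the Customer / B2B Customer example used earlier can be sketched as two tables that share a primary key. This is one possible mapping only, shown with Python's built-in sqlite3 module; the column names (name, credit_limit) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
-- Supertype: attributes common to every customer.
CREATE TABLE Customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);

-- Subtype: its primary key is also a foreign key to the supertype,
-- so each B2B row depends on exactly one Customer row (1:1).
CREATE TABLE B2BCustomer (
    customer_id  INTEGER PRIMARY KEY REFERENCES Customer(customer_id),
    credit_limit REAL
);

-- Hypothetical sample data: only the first customer is a B2B customer.
INSERT INTO Customer VALUES (1, 'Acme Ltd'), (2, 'A. Dolman');
INSERT INTO B2BCustomer VALUES (1, 5000.0);
""")

# A left join lists every customer, with B2B details where they exist.
rows = conn.execute(
    "SELECT c.name, b.credit_limit FROM Customer c"
    " LEFT JOIN B2BCustomer b ON b.customer_id = c.customer_id"
    " ORDER BY c.customer_id").fetchall()
print(rows)
```

The subtype table holds rows only for the customers to which the extra attributes apply, so the supertype table is not padded with empty fields.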

3.5.9. Keys: primary and secondary (“foreign”) What are the primary and foreign keys in the attributes of the entities you have identified?

There is one and only one primary key per entity type. One (sometimes more) field(s) will uniquely identify each entity in a database; therefore, we have to set it to be the primary key. The primary key of an animal patient might be its name or the owner name. However, both of these are bad choices. Why? What better alternative can you suggest? Patient also needs to contain a foreign key - the name of the animal type. Why?

NEVER FORGET: if a table is at the many end of one (or more) one-to-many relationship(s), then the attribute or attributes which uniquely identify records in the One table must also appear as attribute(s) in the Many table. These one or more attributes, known together as a Foreign key, are essential because they link records in the Many table to a record in the One table. Therefore, it is necessary to have a foreign key in Episode corresponding to the patient-code attribute in Patient, and also a foreign key corresponding to the treatment-name attribute in Treatment. The combination of patient-code and treatment-name is not, however, sufficient in this case to act as the primary key – it is also necessary to include the date in order to create a unique key. As a rule of thumb, if there are two or more columns within a given table which together are the logical way to identify that row (and the way you would always join to the table), then use those as a compound key; otherwise assign a separate auto-increment column (in Access, an AutoNumber field) as a primary key.
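The Episode example above can be sketched as follows, using Python's built-in sqlite3 module rather than Access. The attribute names follow the text (patient-code, treatment-name, date), written with underscores; the remaining columns and the sample rows are hypothetical. The compound primary key allows the same treatment to be given to the same patient on different dates, but refuses an exact repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE Patient   (patient_code   INTEGER PRIMARY KEY, patient_name TEXT);
CREATE TABLE Treatment (treatment_name TEXT PRIMARY KEY);

-- Episode is at the many end of two 1:M relationships, so it carries
-- both foreign keys; the date is added to make the key unique.
CREATE TABLE Episode (
    patient_code   INTEGER NOT NULL REFERENCES Patient(patient_code),
    treatment_name TEXT    NOT NULL REFERENCES Treatment(treatment_name),
    episode_date   TEXT    NOT NULL,
    PRIMARY KEY (patient_code, treatment_name, episode_date)
);

-- Hypothetical sample data.
INSERT INTO Patient VALUES (1, 'Rex');
INSERT INTO Treatment VALUES ('Vaccination');
INSERT INTO Episode VALUES (1, 'Vaccination', '2009-01-10');
INSERT INTO Episode VALUES (1, 'Vaccination', '2009-07-10');
""")

episode_count = conn.execute("SELECT COUNT(*) FROM Episode").fetchone()[0]

# Repeating an existing (patient, treatment, date) combination is refused.
try:
    conn.execute("INSERT INTO Episode VALUES (1, 'Vaccination', '2009-01-10')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(episode_count, duplicate_rejected)
```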



Candidate keys There may be more than one possible candidate for use as the primary key of a table. For example, in an employee table, you could use either the company generated employee number, or the Social Security number. In this situation, we say that there are two candidate primary keys.



Choosing a primary key One primary key must be selected for the table. A primary key can sometimes be a compound key, that is it may consist of two or more elements, which in combination uniquely identify the entity occurrence. There may be several candidates, but each entity has one and only one primary key.



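Entity integrity rule The entity integrity rule says that no field participating in the primary key of an entity may be null. Null means that no value has been stored in the field; it is not the same as zero or a string of spaces, which are values in their own right.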



Multi-part primary keys Where a link or intersection entity is used to resolve a many to many relationship into two one to many relationships, it is common for each foreign key in the child entity to form a part of a compound primary key. Sometimes it may be necessary to add an additional part to ensure that the primary key is unique for each instance; most commonly, it is necessary to add a Date.



Foreign keys

In order to create a one to many relationship between two entity types, the primary key of the parent entity (or, much more rarely, another candidate key) is replicated in the child entity as the so-called foreign key. Foreign keys implement one to many (1:M) relationships in the following way. If two entity types are related 1:M, then the primary key attribute(s) (or, rarely, the alternate key attribute(s)) of the one entity MUST appear as attribute(s) of the many entity. This is because this is the only way in which the database software can “join” the many records to the one. Consider a situation in which students are on a programme. The entity types are Programme and Student, related 1:M. If the primary key of Programme is Programme_Code, then Student must also have a Programme_Code attribute.



Ensuring referential integrity The terms referential integrity, linking, Primary and Foreign keys and relationship can be described in this way: Two tables can be linked by a relationship. This link can be one-to-one (e.g. husband to wife), or one-to-many (e.g. one brand of car gives rise to many models of car - but each model has one and only one brand). Coordination is accomplished with relationships between tables. A relationship works by matching data in key fields - usually a field with the same name in both tables. In most cases, these matching fields are the primary key from one table, which provides a unique identifier for each record, and a foreign key in the other table. For example, employees can be associated with orders they're responsible for by creating a relationship between the Employee table and the Order table using the EmployeeID fields. You can ask Microsoft Access to enforce referential integrity: if a table such as patient is related to another table animal type, and referential integrity is enforced, then Access will only allow a new patient to be introduced if the animal type already exists in the animal type table. When you create the properties of a new relationship, you can specify the behaviour to be followed:



Insert



Update If a primary key is changed in an owning table, should the system automatically change the related foreign keys? The answer will usually be yes, and the option should be set. To be certain, it is necessary to model the ordinality of a relationship, as mentioned in section 3.5.4 and again in section 3.5.11.



Delete If a parent record is deleted, should the system automatically delete all the associated child records? Setting this option should only be done after careful thought!
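The insert, update and delete behaviour described above can be sketched with the patient / animal type example. This is an illustration only, using Python's built-in sqlite3 module and SQL foreign-key clauses in place of the Access relationship settings; the sample values are hypothetical, and the delete rule is deliberately the cautious choice (refuse the deletion) rather than a cascade.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE AnimalType (type TEXT PRIMARY KEY);
CREATE TABLE Patient (
    patient_code INTEGER PRIMARY KEY,
    type TEXT NOT NULL REFERENCES AnimalType(type)
        ON UPDATE CASCADE     -- a changed parent key is copied to the children
        ON DELETE RESTRICT    -- a parent with children cannot be deleted
);
INSERT INTO AnimalType VALUES ('Dog');
INSERT INTO Patient VALUES (1, 'Dog');
""")

# Insert: a patient whose animal type does not exist is refused.
try:
    conn.execute("INSERT INTO Patient VALUES (2, 'Unicorn')")
    insert_refused = False
except sqlite3.IntegrityError:
    insert_refused = True

# Update: changing the primary key in the owning table changes the foreign keys too.
conn.execute("UPDATE AnimalType SET type = 'Canine' WHERE type = 'Dog'")
patient_type = conn.execute(
    "SELECT type FROM Patient WHERE patient_code = 1").fetchone()[0]

# Delete: the parent record still has a child record, so the deletion is refused.
try:
    conn.execute("DELETE FROM AnimalType WHERE type = 'Canine'")
    delete_refused = False
except sqlite3.IntegrityError:
    delete_refused = True

print(insert_refused, patient_type, delete_refused)
```

Replacing ON DELETE RESTRICT with ON DELETE CASCADE would instead remove the patient along with the animal type, which is why that option calls for careful thought.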

3.5.10.

Normalisation

We should now go back and check each attribute list is:



Complete



Has the right attributes on the right entities

We may choose to use the formal relational data analysis technique called normalisation. This technique is described in Appendix 3. It is a useful cross-check, but is not essential.

3.5.11. ER Notation There is no standard for representing data objects in ER diagrams. Each modelling methodology uses its own notation. The original notation used by Chen is widely used in academic texts and journals but rarely seen in either CASE (Computer Aided Software Engineering) tools or publications by non-academics. Today, there are a number of notations used, among the more common being Bachman, crow's foot, IDEF1X and SSADM.

Source: http://en.wikipedia.org/wiki/File:ERD_Representation.svg accessed 18/10/2009. All notational styles represent entities as rectangular boxes and relationships as lines connecting boxes. Each style uses a special set of symbols to represent the cardinality of a connection.



Showing relationships diagrammatically using the crow’s foot notation


The symbols used in this document for the basic ER constructs are taken from the American Information Engineering tradition and are also called the crow’s foot notation (in French, patte d’oie).



Entities are represented by labelled rectangles. The label is the name of the entity. Entity names should be singular nouns.



Relationships are represented by a solid line connecting two entities. The name of the relationship is written above the line. Relationship names should be verbs.



Attributes, when included, are listed inside the entity rectangle. Attributes which are identifiers are underlined. Attribute names should be singular nouns.



Cardinality of many is represented by a line ending in a crow's foot. If the crow's foot is omitted, the cardinality is one.



Existence is represented by placing a circle or a perpendicular bar on the line. Mandatory existence is shown by the bar (which looks like a 1) next to the entity of which an instance is required. Optional existence is shown by placing a circle next to the entity that is optional.

There are many different ways of drawing entity-relationship diagrams. In most of this document, we show one-to-many relationships using the crow’s foot notation without particular concern for the ordinality. Where it is desirable or necessary to consider ordinality (whether or not a relationship is mandatory) we can use an extended set of symbols:

We have not been this precise in the remainder of this document.

3.5.12. Online tutorial For an additional online tutorial about entity relationship modelling, see http://www.cems.uwe.ac.uk/~tdrewry/lds.htm checked 24/11/2008. Note that this tutorial sticks rigidly to the SSADM modelling conventions and names and makes reference to Logical Data Structures, LDS. As it makes clear, “Logical data structures are data models, and are sometimes called entity-relationship (ER) models or even entity-attribute-relationship models.” In other words, LDS is a synonym for Entity Relationship Model.

3.5.13. DFDs and ERDs – why both? How are they linked?

♦ DFDs are process models; BUT

    ∗ Data stores usually have to be stored in the database, as one or more entity types

    ∗ External entities may have to be stored in the database, as one or more entity types

♦ ERM is a data model

    ∗ Used to analyse data requirements, and to design database tables, attributes and relationships

3.5.14. Why BOTH Data and Process models? To analyse requirements for data, you should create an Entity Relationship model (also known as a Data Model, an ER model and an ERA model). An ERA model is a complementary technique to process modelling, done for example using Data Flow Diagrams – both are necessary before the overall requirements of a system are understood. Process modelling using DFDs can be used in an iterative, refining fashion both to understand the needs of users better, and to improve on the existing situation with models of a better, computerised, solution. There is no direct link between DFD process models and ERA data models. However, data stores and data flows give clues as to what data needs to be modelled:



External entities and data stores indicate required entities Although note that in some cases a data store is actually an updateable view of (that is, an updateable query on) one or more entities.



Data flows to and from external entities indicate system inputs and outputs

3.6.

Cross-check: entity life history This is another useful cross-check.

3.6.1.Cross-check DFD and ERA The data stores and external entities on the DFD must have counterpart entity types on the ERA diagram – although the correspondence is not necessarily one to one (it can be an updatable view of one or more entity types), there should be somewhere to store all the data indicated in the DFD.

3.6.2. Time dimension Ensure that, for all major entity types, there are processes which CReate, Update, and Delete (CRUD!) them. It may be necessary to create specific processes to carry out operations which create, update or delete entity types. But note that some systems never delete data; instead, they may archive it. Formal Entity Life History models exist as part of SSADM but are rarely constructed nowadays. It is usually sufficient simply to ensure that the above points 3.6.1 and 3.6.2 are respected. See also http://www.cems.uwe.ac.uk/~tdrewry/modeling.htm#Modeling%20Techniques
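The time-dimension cross-check can be carried out on paper, but a small sketch may make the idea concrete. The process names, entity types and the function missing_operations below are all hypothetical; the code simply reports, for each entity type, which of the create, update and delete operations no process yet covers.

```python
# Hypothetical process list: each process names the entity types it
# creates, updates or deletes (archiving counts as deleting here).
processes = {
    "Register patient": {"create": {"Patient"}},
    "Amend patient":    {"update": {"Patient"}},
    "Archive patient":  {"delete": {"Patient"}},
    "Record episode":   {"create": {"Episode"}},
}
entity_types = {"Patient", "Episode"}

def missing_operations(entity_types, processes):
    """Map each entity type to the operations that no process covers."""
    gaps = {}
    for entity in sorted(entity_types):
        # Collect every operation that some process performs on this entity.
        covered = {
            operation
            for actions in processes.values()
            for operation, targets in actions.items()
            if entity in targets
        }
        missing = sorted({"create", "update", "delete"} - covered)
        if missing:
            gaps[entity] = missing
    return gaps

gaps = missing_operations(entity_types, processes)
print(gaps)
```

In this made-up example Patient is fully covered, while Episode still needs processes to update and to delete (or archive) it, which is exactly the kind of omission the cross-check is meant to expose.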

3.7.

Model User System Interactions Work out what interactions will take place between system users and the information system you are creating. This can be done by using a low-level Use Case diagram or by other techniques not taught in the School.



System inputs and outputs Identification of these can be aided by Use Case diagrams, since these indicate who needs to use a system and for what. The Use Case diagram will indicate the basic interactions between the system and its users - some of these will be data input / update actions, others will involve information output. They can also be used to derive a list of system inputs and outputs – e.g. forms, reports and/or webpages. The technique is described in Appendix 29.

3.8.

Define required outputs: reports, forms, queries What is the database actually for? What Information is to be Output? (Query etc.). Think out what OUTPUTS are required from the system. This is the information that system users require FROM the system in order to be able to do their jobs more effectively, to make better decisions etc. From this, you can create a list of the information-yielding Forms and Reports required, and the Queries necessary to support those forms and reports.

3.9.

How will Input / Update be carried out (Forms etc.)? Think out what the necessary system INPUTS are, and when system users will be in a position to store that input data. How will Input / update be carried out (Forms etc.)? Although formal dialogue design methods exist, they are usually taught as part of specific HCI (Human Computer Interface) or webpage design courses. The approach recommended in this module is simply to identify the necessary forms, perhaps sketching them out on paper or in an MS Office product. The Use Case diagram produced as part of the modelling of User System interactions tells you where forms, etc, are needed: each interaction between a human user (actor) and a Use Case implies a need for Inputs and Outputs.

3.10. Work through your design on paper, whiteboard, etc. As far as you can, ensure that the system you are proposing will do the job you have outlined for it. Doing this properly is time consuming and is hard work, but it usually pays off in terms of avoided wasted implementation effort at the computer. It is also extremely useful in helping to identify the steps in a test plan, that is, the tests that you will have to carry out on the implemented system and the expected results of those tests. An excellent technique for improving the quality of your work is to work with team colleagues. This can either be explicit collaboration (e.g. working in pairs); or it can take the form of structured walkthroughs. Structured walkthroughs are discussed in appendix 5.

3.11. Implementing processes in Access The implementation of complex processes will involve at least macros and perhaps module programming in (for example) the Visual Basic for Applications programming language. However, for many simple database applications, this is not necessary. Instead, all that is required is to guide the user as to which form, query or report they should next be using. This can be done by creating an initial form, called a switchboard, and by the use of menus and submenus. This approach leaves the choice of what step to take next to the user, a style of use which we sometimes call event-driven.

3.11.1. System data processing In order actually to carry out the data processing, Microsoft Access supports update, append and delete operations which operate on complete sets of records defined by corresponding queries. Therefore, the contents of the database can be changed under the control of queries. Beyond that, it is necessary to use a programming language. The specification of data processing is carried out by means of descriptions of the algorithms involved, using techniques such as pseudocode. The language supported within Access is Visual Basic for Applications. The use of pseudocode and of programming languages is beyond the scope of this document.
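To give a flavour of what update, append and delete queries do to complete sets of records, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in, since Access cannot be run from a printed page; the table and field names are hypothetical, and the SQL that Access itself generates may differ in detail.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Student (StudentNo TEXT PRIMARY KEY, Name TEXT,
                          ProgrammeCode TEXT, Graduated INTEGER);
    CREATE TABLE StudentArchive (StudentNo TEXT PRIMARY KEY, Name TEXT,
                                 ProgrammeCode TEXT);
    INSERT INTO Student VALUES ('ABC00000001', 'Ann', 'BIS', 1);
    INSERT INTO Student VALUES ('ABC00000002', 'Bob', 'BIS', 0);
    INSERT INTO Student VALUES ('ABC00000003', 'Cat', 'MKT', 0);
""")

# Update query: change every record that matches the criterion.
con.execute("UPDATE Student SET ProgrammeCode = 'BIT' WHERE ProgrammeCode = 'BIS'")

# Append query: copy the selected set of records into another table.
con.execute("""INSERT INTO StudentArchive (StudentNo, Name, ProgrammeCode)
               SELECT StudentNo, Name, ProgrammeCode
               FROM Student WHERE Graduated = 1""")

# Delete query: remove the same set of records from the live table.
con.execute("DELETE FROM Student WHERE Graduated = 1")

live = con.execute(
    "SELECT StudentNo, ProgrammeCode FROM Student ORDER BY StudentNo").fetchall()
archived = con.execute(
    "SELECT StudentNo, ProgrammeCode FROM StudentArchive").fetchall()
print(live)
print(archived)
```

Each statement works on whichever records satisfy its WHERE clause, which is what is meant by the contents of the database being changed under the control of queries.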

3.12. Define (“design” in Access terms) the database: Build a prototype This is the stage at which you "tell" Access what your design decisions are, as you design the tables, attributes, forms, queries, reports etc. There's a fair bit of work here, but if you have done your "homework" in terms of carefully carrying out the steps already described, and have already taken the trouble to become reasonably competent in your use of Microsoft Access as a package, you should be able to concentrate on a good Access implementation of a design in which you can already place some confidence.

3.13. Refine / iterate / implement At each stage, but particularly at the design stage and early implementation stage, you will be learning more about what your system should be doing and you may also be finding problems with the way it currently works. Where necessary, revise the earlier work and then go on to ensure that later work is in line with the changed requirement. If you do this conscientiously, there should not be too many surprises at the system testing stage.

3.14. Test the database As you implement the system, test as many aspects as you can as you go along. If, for example, you have created a table and a form by which to input and/or update data in that table, ensure that the form works and that the data you can input is sensible and appropriately validated. As you add new forms and subforms, you should find that you can both test them, and also go back to forms and queries you implemented earlier and further check that they behave as they ought to do. When you find problems, do not rush to solve them too quickly. It is wise to carry out as many tests as you can, building up a “bug list”, or register of problems / deficiencies. Tackle these in small batches, rather than one at a time, because you will often find that apparently different problems are linked and have the same underlying causes.

3.15. Obtain User Feedback Eventually, you will have a working system which does not have too many obvious implementation faults. However, it is more than likely that it will not be what the would-be system users expected. It is often the case that users react badly when first shown a system. This is likely to be for one of the following reasons:



You, the designer / implementer, have made mistakes in implementing a user's requirements You may well have omitted a user requirement, or got it wrong. Such deficiencies are bugs, and you have to put them right.



You, the designer / implementer, have failed to understand some aspect of the user's requirements of a system

The fault in this case may lie with you or the user or both, but you have to reach some agreement on what needs to be done to put the system right, and it will normally be at your expense.



The user has not themselves sufficiently thought through what they require of a system Even though the "fault" here is more obviously with the user, you still have to reach some agreement on what needs to be done to put the system right.



The user, inspired by seeing the implemented system, decides that they would like the system to do more Great! A business opportunity! You enter the required additional functionality on a document which you may grandly term the Enhancement Register, work out the implications in terms of additional design and implementation effort, and tell the user what the enhancements will cost in terms of later delivery and / or an increased bill. You should never allow yourself to get dragged into a cycle of continuously responding to such changes as you go along, without explicit renegotiation of the terms of reference agreed at the outset of the project.

3.16. Refine the system by Iteration The ideal in any area of activity is "Right, on time, first time, and every time" - that is, wherever possible, we should aim to avoid repeated work and wasteful repetition. However, even the most experienced information systems designers do not achieve this when designing and implementing information systems. It is almost invariably necessary to go back to earlier stages in the simple methodology, repeating the work. Note that if you change work carried out at an early stage in the process, you will normally have to repeat not only that step but also all subsequent steps. Professional systems analysts and designers build in a significant amount of time and budget into their original plan in order to cover this reality.

4.

Putting database design theory into practice

4.1.

Design aids Use some or all of the following as appropriate to the context in which you are working.



Diagramming tools E.g. Microsoft Visio Professional, SmartDraw, or EDraw. For more on using Visio, please see Appendix 4.



Using a spreadsheet as a simple data dictionary Professional developers often make use of a database about the design decisions they make. This database about databases is sometimes called a Data Dictionary, sometimes a Repository. For the small-scale scenarios dealt with in this module, it is possible and sensible to use a relatively straightforward spreadsheet as a simple Data Dictionary. Such spreadsheets can be a useful aid when recording attributes and rearranging them subsequently. See section 24.1 for an example of how this can be useful.



CASE tools CASE tools -- CASE here stands for computer aided software engineering (and has nothing to do with use case diagrams!) -- are tools which are used by professionals to manage the development of large and complex systems. The use of these tools is beyond the scope of this document.

4.2.

An Exercise

Assume that you are the people who originally designed the University of Anytown database used as an example later in this booklet. Now that you know about the various stages required to analyse user needs and design a database solution, carry out those steps for yourself for a business school. Go through the various stages and carefully document what you do at each stage. Or, if you are responsible for database design in an assignment I have set, do the same thing for that database. This is a significant piece of work - it will probably take you at least a few hours of effort, and may well take you a week of on-and-off effort. When you have finished the University of Anytown database design, compare the results of your work with those of the original analyst / designers. You should find that you have reached similar or better conclusions.

4.3.

Achieving real competence in Database Design If you can successfully tackle this exercise and test, you have achieved a reasonable mastery of database design and should go on to a really difficult task. So you need to test out your skills! You can do some work on one of the suggested scenarios that follow. Alternatively and/or additionally, move on directly to your own scenario, such as the one you are working on as part of an assignment set by your teachers.

4.3.1. Documented scenarios I have produced a companion document, called Database design and implementation cases, which is available on request. This describes the following scenarios.



Instruction Training Company This requirement is separately documented as "Instruction Training Company".



Dating agency This requirement is separately documented as "Seekers Dating Agency".



Filing system This requirement is separately documented as "Filing System".



Video Collection This requirement is separately documented as "Video Collection".



A catalogue of your record, CD or tape collection - To include entities such as Album, Artist, Track, and attributes such as media (e.g. CD, tape, vinyl) and play-time (length of track in minutes and seconds). This requirement is separately documented as "Media System"



A contact / correspondence management system This requirement is separately documented as "Contact Manager".



A Student CD Library system This requirement is separately documented as "Student CD Library".

4.3.2.Suggested but undocumented scenarios Decide on a scenario of value to you in your work or in your leisure time in which a database might be of value. In each case, you should prepare a scenario document, similar to the ones listed above, which sets out the main features of the problem area you intend to tackle. Here are some ideas - you can of course come up with your own idea, but you are advised to discuss it with a tutor just in case it is inappropriately complex:



Details of the modules (“courses”) you have studied, the lectures and classes which formed part of that module, and a diary of significant events.



Details of your personal research - to include entities such as references, authors, citations. Be aware that this is a fairly difficult example to tackle - for example, how will your database design cope with an article by many authors (NOT just two or three, maybe five?).
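One common way of coping with an article by many authors is a link table between the two entities, with one row per author of each article. The sketch below is only one possible design under that assumption; the table and field names (Author, Article, Authorship, AuthorOrder) are hypothetical, and Python's sqlite3 module stands in for Access.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Author  (AuthorID  INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
    CREATE TABLE Article (ArticleID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
    -- Link table: one row per (article, author) pair, so an article can
    -- have any number of authors and an author any number of articles.
    CREATE TABLE Authorship (
        ArticleID   INTEGER NOT NULL REFERENCES Article(ArticleID),
        AuthorID    INTEGER NOT NULL REFERENCES Author(AuthorID),
        AuthorOrder INTEGER NOT NULL,
        PRIMARY KEY (ArticleID, AuthorID)
    );
""")

names = ["Author A", "Author B", "Author C", "Author D", "Author E"]
con.executemany("INSERT INTO Author (AuthorID, Name) VALUES (?, ?)",
                list(enumerate(names, start=1)))
con.execute("INSERT INTO Article VALUES (1, 'A five-author article')")
con.executemany("INSERT INTO Authorship VALUES (1, ?, ?)",
                [(i, i) for i in range(1, 6)])

# List the authors of article 1 in their stated order.
authors = [row[0] for row in con.execute("""
    SELECT Author.Name
    FROM Authorship JOIN Author ON Author.AuthorID = Authorship.AuthorID
    WHERE Authorship.ArticleID = 1
    ORDER BY Authorship.AuthorOrder""")]
print(authors)
```

Because the number of authors is a matter of how many rows exist in the link table, the design does not need to change whether an article has two authors or twenty.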

4.3.3.Further study Before going much further with database design, and assuming you are full of enthusiasm for databases, you are advised to study the topic in a textbook, learning more about concepts such as normalisation, which is a very useful technique for ensuring that each table has exactly the right attributes. Normalisation is introduced in appendix 3. For a basic treatment, see [Hughes 2000]; for an advanced treatment, see [Date 2003].

5.

More about Databases

5.1.

What is a database? A database is a collection of inter-related data, stored together without unnecessary redundancy, which can serve multiple uses and applications. It is the physical implementation on a computer of the data model (entity-relationship model) created by the data analyst. Databases were originally a reaction to uncoordinated “conventional files”. Entities are implemented as tables. Entity occurrences are records. Attributes are implemented as fields. Key relationships may be enforced - in Access, for example, it is possible to "enforce referential integrity". This ensures that in a one to many situation, the many-end record cannot exist unless a one-end record already exists. This is good, in that it prevents unwittingly creating records that are not linked to anything else. The emphasis throughout is on effective data retrieval in order to answer arbitrarily complex questions.
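The effect of enforcing referential integrity can be seen in a few lines. This is a minimal sketch which assumes Python's sqlite3 module as a stand-in for Access; the Programme and Student tables and their fields are hypothetical. Note that SQLite only enforces foreign keys when asked to, whereas in Access the equivalent is ticking "enforce referential integrity" on the relationship.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
con.executescript("""
    CREATE TABLE Programme (ProgrammeCode TEXT PRIMARY KEY, Title TEXT);
    CREATE TABLE Student (
        StudentNo     TEXT PRIMARY KEY,
        Name          TEXT,
        ProgrammeCode TEXT NOT NULL REFERENCES Programme(ProgrammeCode)
    );
    INSERT INTO Programme VALUES ('BIS', 'Business Information Systems');
""")

# The one-end record exists, so this many-end record is accepted.
con.execute("INSERT INTO Student VALUES ('ABC00000001', 'Ann', 'BIS')")

# There is no programme 'XXX', so this record would be an orphan.
try:
    con.execute("INSERT INTO Student VALUES ('ABC00000002', 'Bob', 'XXX')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

student_count = con.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(orphan_rejected, student_count)
```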

5.2.

The history of databases ♦

Hierarchic and network databases were invented in the 1960s.



1970: Dr. E.F. Codd introduces the concept of the relational database In 1970, the expatriate British researcher Edgar Codd was working for IBM in the United States. He suggested that a better basis for database implementation was relational set theory, a mathematical approach.



Concepts are relatively simple and have a strong theoretical basis, that of mathematical set theory In a relational database, care is taken to keep all the data for a set of like entities in what is mathematically a relation or a set, but what we would probably refer to as a table of records: e.g. student. Each different kind of entity is kept separately: so we might also have a programme entity (or relation or table - these are equivalent terms). Student records are linked to a programme record by means of a shared linking attribute, in this case, the programme code.



Relational model does have limitations but is currently the dominant paradigm (way of thinking)



Object databases are just beginning to become commercially significant and might dominate eventually The relational database paradigm has been dominant since about 1980, and has yet to be displaced by the more recent object database approach.



Oracle – hybrid object-relational approach

5.3.

Implementing data models in MS Access ♦

Entities are implemented as tables ∗

Entity occurrences are records or rows in the tables



Attributes are implemented as fields



Key relationships should almost always be enforced ∗

Set cascade update yes; think hard about cascade delete


The diagram shows a situation in which a foreign key, programme code, in a Student table is being linked to the corresponding programme code in the Programme table. It is necessary for a Programme to exist before a Student can be registered. It is probably appropriate automatically to cascade any change to the programme code in Programme to each Student record having that code. By contrast, the deletion of a Programme might not require the deletion of linked Students (who perhaps studied on the programme before it was deleted).
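The cascade behaviour just described can be sketched as follows. This is an illustration only: sqlite3 stands in for Access, the table and field names are hypothetical, and the ON UPDATE CASCADE / ON DELETE RESTRICT clauses are the nearest SQL equivalents I am assuming for "cascade update yes, cascade delete no".

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
    CREATE TABLE Programme (ProgrammeCode TEXT PRIMARY KEY, Title TEXT);
    CREATE TABLE Student (
        StudentNo     TEXT PRIMARY KEY,
        ProgrammeCode TEXT NOT NULL
            REFERENCES Programme(ProgrammeCode)
            ON UPDATE CASCADE     -- a changed code is copied to every linked Student
            ON DELETE RESTRICT    -- a Programme with Students cannot be deleted
    );
    INSERT INTO Programme VALUES ('BIS', 'Business Information Systems');
    INSERT INTO Student VALUES ('ABC00000001', 'BIS');
""")

# Changing the programme code in Programme cascades to the Student record.
con.execute("UPDATE Programme SET ProgrammeCode = 'BIT' WHERE ProgrammeCode = 'BIS'")
student_code = con.execute("SELECT ProgrammeCode FROM Student").fetchone()[0]

# Deleting the Programme is refused while a Student still refers to it.
try:
    con.execute("DELETE FROM Programme WHERE ProgrammeCode = 'BIT'")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

print(student_code, delete_blocked)
```

Refusing the delete is only one policy; the point of "think hard about cascade delete" is that the designer has to decide what should happen to the linked Students.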



The emphasis has to be on effective data retrieval in order to answer arbitrarily complex questions

5.4.

What is a database management system? ♦

Software which manages a database

Implements entities as tables, maintaining and enforcing relationships

Deals with all the component disc files

Provides functions such as ∗

Table creation and structural updating

Insert, update and delete operations, on individual records and on complete sets of records

Queries, reports and forms

5.5.

First challenge: database design A professional systems analyst will carry out parallel data analysis and also process modelling. The results of data analysis will be compared with Data Flow Diagrams (process view): data stores and flows give clues as to what data needs to be modelled.



Decide the purpose of the database



Analyse the needs of its users in data terms ∗

Compare with Data Flow Diagrams (process view): data stores and external entities give clues as to what data needs to be modelled



Informal cross-referencing between process and data models Informal cross-referencing between process and data models may point up potential problems in one or both sets of analysis.



Design the database on paper Design the database on paper, define the computer implementation, and only then think about implementation! It is very important to give consideration to how input / update will be carried out (forms etc.); in connection with electronic business it is worth noting that nearly every interaction with the system will take place through a web interface.

5.6.

Second challenge: database implementation ♦



Define the computer implementation ∗

Tables

Attribute types: e.g. student number is 11 chars text, split three letters and eight digits

How will input / update be carried out? ∗

Forms

Web interface

5.7.

An inductive approach ♦

Learning from examples: ∗

Student: Anytown case See section 9.



NorthWind2003 is a complex but useful example in an e-business context, based as it is on the standard Microsoft NorthWind example.

5.8.

What Is a Database? ♦



A structured collection of ELECTRONICALLY STORED data ∗

Controlled & accessed through computers



The structure is given by predefined relationships between predefined types of data items

May include many types of data


5.9.

What Is a DBMS? ♦

Database management system (DBMS) = an integrated set of programs, used to define, update, and control the database



Examples ∗

Small: MS Access, OpenOffice.org Base

Medium: MS SQL Server, MySQL, PostgreSQL

Large: ORACLE, IBM DB2


SECTION 2 – USING MICROSOFT ACCESS TO BUILD GOOD DATABASES

6.

Introduction to Microsoft Access The Microsoft Office Access relational database management system is software to manage databases.

6.1.

What is a database management system? Microsoft Access is an example of a Relational DataBase Management System (RDBMS). It is positioned in the marketplace as an office productivity aid. As such, it has limited resilience and recovery facilities. It can be used by more than one person at the same time, but is NOT suitable for "mission-critical" high reliability or high performance applications: beyond a handful of concurrent users, Access runs out of steam! Access - and other small scale databases, such as Filemaker Pro - offer the standard features expected of such programs:

6.1.1. Software which manages a database

6.1.2. Implements entities as tables, maintaining and enforcing relationships

6.1.3. Deals with all the component disc files In Access, there is only one file per database. Bigger databases have much more complex file structures stored on disc.

6.1.4. Provides functions such as ♦

Table creation and structural updating



Form-based insert, update and delete operations, on individual records and on complete sets of records



Query and report facilities Access provides powerful query and report facilities. When you define a query, what Access does on your behalf, behind the scenes, is to create and then run a query expressed in a powerful industry-standard programming language called SQL (Structured Query Language). You can in fact see the generated SQL if you use View / SQL View.
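To show the kind of SQL statement that sits behind a query joining two tables, here is a minimal sketch. The statement is written in generic SQL and run through Python's sqlite3 module as a stand-in; the text that Access shows in SQL View will differ in detail (for example in its use of brackets and semicolons), and the tables and data are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Programme (ProgrammeCode TEXT PRIMARY KEY, Title TEXT);
    CREATE TABLE Student (StudentNo TEXT PRIMARY KEY, Name TEXT, ProgrammeCode TEXT);
    INSERT INTO Programme VALUES ('BIS', 'Business Information Systems'),
                                 ('MKT', 'Marketing');
    INSERT INTO Student VALUES ('ABC00000001', 'Ann', 'BIS'),
                               ('ABC00000002', 'Bob', 'MKT'),
                               ('ABC00000003', 'Cat', 'BIS');
""")

# "Which students are on the BIS programme, and what is it called?"
sql = """
    SELECT Student.Name, Programme.Title
    FROM Student INNER JOIN Programme
         ON Student.ProgrammeCode = Programme.ProgrammeCode
    WHERE Programme.ProgrammeCode = 'BIS'
    ORDER BY Student.Name
"""
rows = con.execute(sql).fetchall()
print(rows)
```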

6.1.5.An approachable programming language Access offers Visual Basic for Applications (VBA) and SQL.

6.2.

Important facilities of more advanced DBMS More advanced DBMS offer additional features, and are usually structured to operate in a so-called "client server" situation. In a client-server application, client programs running on PCs or other low-power computers present data to individual users. The data itself is managed and stored on a database server computer to which all the client machines are connected. The clients may either be directly connected to the distant database server, or they may run a local database (usually Access) which connects to the distant database server (e.g. MS SQL Server, Oracle ….) using the Open DataBase Connectivity feature ODBC, or they may present the data in the database by means of web pages. The architecture then looks something like:

[Figure: Client Server Architecture. The labels recoverable from the diagram are as follows.

Client workstation 1. This workstation simply displays web pages, which it asks for using the HTTP protocol. Link: HTTP carrying HTML.

Client workstation 2. In addition to displaying web pages, this PC runs a copy of Microsoft Access which can also query the remote database (SQL Server or ORACLE or whatever) which runs on the main database server computer. Link: HTTP carrying HTML; plus ODBC (open database connectivity feature) for database queries.

Web Server – responds to HTTP and delivers HTML; the web pages can include database forms, with data stored on the database server computer.

Database server computer. It is this central computer that runs the company’s main transaction processing and database software.

Other labels: Local server; Internet; Servers; Central computers.]

♦ Support many users and multiple applications

∗ MS Access does this, sort of ... an individual database may support a handful of users

6.3.

Further facilities of more advanced DBMS

♦ Depend upon a data dictionary (sometimes called a repository)

♦ Integrate with the CASE (Computer Aided Software Engineering) tool which created and maintains the data dictionary

♦ Implement resilience and recovery mechanisms

These things include roll-forward and / or roll-back mechanisms so that complete transactions (only) are carried out. Such mechanisms are essential to prevent situations where, for example, money leaves one company’s bank account, but never reaches another company’s.



Enforce security

Only privileged users should be able to see things like payroll data.
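The roll-back idea mentioned above, that only complete transactions are carried out, can be illustrated with a money transfer. This is a minimal sketch using Python's sqlite3 module rather than one of the larger DBMS; the Account table, the `transfer` function and the rule that a balance may not go negative are all assumptions made for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Account (Name TEXT PRIMARY KEY,
                          Balance INTEGER NOT NULL CHECK (Balance >= 0));
    INSERT INTO Account VALUES ('Company A', 100), ('Company B', 0);
""")


def transfer(con, source, target, amount):
    """Move money between accounts as one transaction: both updates or neither."""
    try:
        with con:  # commits on success, rolls back if an exception is raised
            con.execute("UPDATE Account SET Balance = Balance + ? WHERE Name = ?",
                        (amount, target))
            con.execute("UPDATE Account SET Balance = Balance - ? WHERE Name = ?",
                        (amount, source))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(con, "Company A", "Company B", 60)      # succeeds
failed = transfer(con, "Company A", "Company B", 60)  # would overdraw A: rolled back
balances = dict(con.execute("SELECT Name, Balance FROM Account"))
print(ok, failed, balances)
```

In the second call the credit to Company B has already been made when the debit from Company A fails, and the roll-back undoes that credit, so no money appears from nowhere.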

6.3.1. Other RDBMS Examples of so-called "industrial strength" databases include



The “Big Three”

These are the commercial databases at the heart of most large enterprises:

(a)

Microsoft SQL Server

(b)

Oracle Corporation ORACLE

(c)

IBM DB2



Computer Associates INGRES

The first commercial-strength database management system, INGRES is now an Open Source product.



MySQL

MySQL is a relational database management system which has more than 11 million installations. MySQL is popular for web applications and acts as the database component of the LAMP, BAMP, MAMP, and WAMP platforms (Linux/BSD/Mac/Windows-Apache-MySQL-PHP/Perl/Python), and for open-source bug tracking tools like Bugzilla. Its popularity for use with web applications is closely tied to the popularity of the PHP programming language and the Ruby on Rails programming framework, which are often combined with MySQL.



PostgreSQL

These systems are capable of supporting very large numbers of users and transaction rates measured in tens or hundreds every second. They are typically used to store so-called "corporate" databases, and are "overkill" in the context of the personal and team productivity applications to which Access is well-suited.

6.4.

Why we want business students to learn Access We expect our students to become competent Access users and (to a limited extent) designers. We start off with Access for a number of reasons:

6.4.1. The relative ease-of-use of MS Access Industrial strength databases, such as ORACLE, are harder to learn and less well integrated into the PC environment, whereas MS Access is easily accessible (sic!)

6.4.2. MS Access is easily obtained It forms a part of the Microsoft Office Professional and Premium office suites (although it is included neither in the Small Business Edition nor the Student edition). MS Access is available in French on ESC Rennes student workstations for students who do not have a copy on their own personal machine. Alternatively a free copy can be obtained by means of the MSDN Academic Alliance membership of the School.

6.4.3.MS Access supports usable programming languages MS Access supports and integrates with the two most widely-used programming languages associated with personal productivity aids and with databases. These two languages are Basic and SQL. In fact, MS Access provides Basic and SQL in a number of ways:



Visual Basic for Applications (VBA) The dialect of Basic supported by MS Access is VBA - the same language also used internally by several other Microsoft Office products, including Excel and the Visio business drawing package.



Visual Basic (VB) itself Access can act as the so-called "Jet Engine", providing database facilities to programs written in Visual Basic.



SQL: Structured Query Language SQL, Structured Query Language, is a database query language that was adopted as an industry standard in 1986. In their SQL standard, the American National Standards Institute ANSI declared that the official pronunciation for SQL is "es queue el". However, many database professionals have taken to the "slang" pronunciation sequel that reflects the language's original name, Sequel, before trademark conflicts caused IBM to insist on the ‘official’ pronunciation. Access supports a reasonably-comprehensive subset of the ANSI SQL 92 standard. This provides the basis for a high degree of integration with other databases - so that, for example, an Access database can act as a client to an industrial-strength database running on a server computer.

7.

MS Access implementation of data models Access provides a direct implementation of many of the features of a data model (entity relationship model). Once an ERM has been completed, it can be implemented in Access as:

7.1.

Tables, one per entity type

7.2.

Fields, one per attribute

7.3.

Records, one per entity occurrence

7.4.

Attribute types in MS Access What data type should you use for a field in a table? Decide what kind of data type to use for a field based on these considerations:

• What kind of values do you want to allow in the field? For example, you can't store text in a field with a Number data type.

• How much storage space do you want to use for values in the field?

• What types of operations do you want to perform on the values in the field? For example, Microsoft Access can sum values in Number or Currency fields, but not values in Text or OLE Object fields.

• Do you want to sort or index a field? Memo, Hyperlink, and OLE Object fields can't be sorted or indexed.

• Do you want to use a field to group records in queries or reports? Memo, Hyperlink, and OLE Object fields can't be used to group records.

• How do you want to sort values in a field? In a Text field, numbers sort as strings of characters (1, 10, 100, 2, 20, 200, and so on), not as numeric values. Use a Number or Currency field to sort numbers as numeric values. Also, many date formats will not sort properly if entered in a Text field. Use a Date/Time field to ensure proper sorting.
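The point about sorting is easy to demonstrate. The short sketch below uses plain Python rather than Access, with the same values as in the last consideration above, to show how values sort when treated as text and when treated as numbers.

```python
raw = ["200", "2", "100", "20", "1", "10"]

# Sorted as text: compared character by character, as in a Text field.
as_text = sorted(raw)

# Sorted as numbers: compared by value, as in a Number field.
as_numbers = sorted(int(value) for value in raw)

print(as_text)
print(as_numbers)
```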

7.5.

Permitted data types in MS Access The following table summarises all the field data types available in Microsoft Access, their uses, and their storage sizes.

Data Type | Use | Size

Text (Français: Texte) | Text or combinations of text and numbers, such as addresses. Also numbers that do not require calculations, such as phone numbers, part numbers, or postal codes. | Up to 255 characters (if you need more text in a field, you have to use a Memo field).

Memo (Mémo) | Lengthy text and numbers, such as notes or descriptions. | Up to 64,000 characters.

Number (Numérique) | Numeric data to be used for mathematical calculations, except calculations involving money (use Currency type). Set the FieldSize property to define the specific Number type. | 1, 2, 4, 8 bytes.

Byte (Octet: Numérique 1 octet) | Stores numbers from 0 to 255 (no fractions). | 1 byte

Integer (Entier: Numérique 2 octets) | Stores numbers from –32,768 to 32,767 (no fractions). | 2 bytes

Long Integer (Entier Long: Numérique 4 octets) | (Default) Stores numbers from –2,147,483,648 to 2,147,483,647 (no fractions). | 4 bytes

Single (Réel simple) | Stores numbers from –3.402823E38 to –1.401298E–45 for negative values and from 1.401298E–45 to 3.402823E38 for positive values. | 4 bytes

Double (Réel double) | Stores numbers from –1.79769313486231E308 to –4.94065645841247E–324 for negative values and from 4.94065645841247E–324 to 1.79769313486231E308 for positive values. 15 decimal places. | 8 bytes

Date/Time (Date/Heure) | Dates and times. | 8 bytes

Currency (Monétaire) | Currency values. Use the Currency data type to prevent rounding off during calculations. Accurate to 15 digits to the left of the decimal point and 4 digits to the right. | 8 bytes

AutoNumber (Numérotation automatique) | Unique sequential (incrementing by 1) or random numbers automatically inserted when a record is added. | 4 bytes

Yes/No (Oui/Non) | Fields that will contain only one of two values, such as Yes/No, True/False, On/Off. | 1 bit

OLE Object (Liaison OLE) | Objects (such as Microsoft Word documents, Microsoft Excel spreadsheets, pictures, sounds, or other binary data), created in other programs using the OLE protocol, that can be linked to or embedded in a Microsoft Access table. You must use a bound object frame in a form or report to display the OLE object. | Up to one gigabyte (subject to disc space!)

Hyperlink (Hyperlien) | Field that will store hyperlinks. A hyperlink can be a UNC (Universal Naming Convention) path to a file, or a URL. | Up to 64,000 characters

Assistant for choosing from a list (Assistant Liste de choix) | Creates a field which permits you to choose, from a scrolling list, a value which comes either from another table or from a specified list of permitted values. If you choose this option, a wizard appears to help you to define the field. | The same size as the primary key of the corresponding table. In the (common) case where this is an AutoNumber field, it will be 4 bytes in length.

NB (AutoNumber): if you use an automatically numbered field as part of the primary key of a table, and you also have to use it as the foreign key in a linked table, the data type required in the many end is long integer, which is how in fact an AutoNumber field is stored.

7.5.1. Use of Number or Currency fields Microsoft Access provides two field data types to store data containing numeric values: Number or Currency. Use a Number field to store numeric data to be used for mathematical calculations, except calculations that involve money or that require a high degree of accuracy. The kind and size of numeric values that can be stored in a Number field is controlled by setting the FieldSize property. For example, the Byte field size will only store whole numbers (no decimal values) from 0 to 255 and occupies 1 byte of disk space. Use a Currency field to prevent rounding off during calculations. A Currency field is accurate to 15 digits to the left of the decimal point and 4 digits to the right. A Currency field occupies 8 bytes of disk space.
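Why rounding matters for money can be seen with a very small sum. The sketch below is in Python, not Access: the built-in float type behaves like the Double field size, and the Decimal type is used here only as a stand-in for a fixed-point type such as Currency.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 exactly,
# so small errors creep into sums.
float_total = 0.1 + 0.2
float_is_exact = (float_total == 0.3)

# A decimal fixed-point style of arithmetic keeps pennies exact.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = (decimal_total == Decimal("0.30"))

print(float_total, float_is_exact)
print(decimal_total, decimal_is_exact)
```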

7.5.2. Storing telephone numbers

Quite a lot of data which we informally refer to as numbers are in practice nothing of the kind! For example, telephone numbers include non-numeric characters such as plus signs, brackets and spaces, and may begin with leading zeros: +44 (0)789 12345. For this reason, such fields must be stored as text. See the next section for details of how to ensure that the data stored in the fields is correctly formatted.

7.5.3. Controlling data entry formats with masks

When you have several people entering data in your database, you can define how users must enter data in specific fields to help maintain consistency and to make your database easier to manage. For example, you can set an input mask for a form so that users can only enter telephone numbers in the Swedish format or addresses in the French format. You can set a specific format for the input mask, and select another format so that the same data is displayed differently.

For full details of masks and how to use them, please refer to Microsoft Access documentation available online: http://office.microsoft.com/enus/access/HA100964521033.aspx#2
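To make the idea of a mask concrete, here is a minimal sketch of how a mask can be checked. It handles only two Access mask characters, 0 (a digit is required) and L (a letter is required), and treats every other character as a literal. The function name matches_mask is hypothetical, and a real Access mask supports many more characters than this.

```python
def matches_mask(value: str, mask: str) -> bool:
    """Check value against a mask made of '0' (a digit is required),
    'L' (a letter is required) and literal characters such as '/'."""
    if len(value) != len(mask):
        return False
    for ch, m in zip(value, mask):
        if m == "0":
            if not ch.isdigit():
                return False
        elif m == "L":
            if not ch.isalpha():
                return False
        elif ch != m:
            # Any other mask character must appear literally.
            return False
    return True


print(matches_mask("EMP1234", "LLL0000"))        # True
print(matches_mask("12/05/2009", "00/00/0000"))  # True
print(matches_mask("EMP12X4", "LLL0000"))        # False
```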

7.6. Keys

7.6.1. Candidate keys

If there is only one candidate key, it has to be the primary key, and the comments below for primary key apply. If a key is a candidate key but not the primary key, it is wise to set additional properties: indexed; null forbidden; and duplicates not allowed.

7.6.2. Primary key

There is only one primary key per table (although the single primary key may have multiple attributes within it -- please refer to section 7.6.3). If the key has only one part, select it, and use Edit / Primary key to set the attribute as primary key.



Primary keys in MS Access

A commonly used technique in Access is to use an AutoNumber field as a primary key attribute. An AutoNumber field is in fact a Long Integer value. For this reason, the data type of a corresponding foreign key field should be set to Long Integer.

7.6.3. Multi-part primary keys

The primary key may be multipart. To create a multipart primary key in Access, select the first field, then, holding down the Control key, select the second and subsequent parts. Once all parts of the primary key are selected, use Edit / Primary key to set the attributes as the primary key.

7.6.4. Entity integrity rule

This rule states that no field participating in the primary key of an entity may be null. This rule may be enforced in Microsoft Access as follows: set the Null Forbidden property for each attribute which participates in the primary key.

7.6.5. Foreign keys

In Microsoft Access, a foreign key is created by adding, to the table at the many end of a one-to-many relationship, attributes which correspond to the primary key of the table at the one end. These attributes must have the same data type and size as the attributes in the primary table. If the primary key is an AutoNumber field, the attribute in the many table should be declared as a Long Integer. Then a one-to-many relationship should be established, as is described in the next section.

7.7. Relationships

Defining relationships in Access involves adding the tables you want to relate to the Relationships window, and then dragging the primary key field from one table and dropping it on the foreign key field in the other table. The kind of relationship that Microsoft Access creates depends on how the related fields are defined:



One-to-many relationship A one-to-many relationship is created if only one of the related fields is a primary key or has a unique index. This is usually the case.



One-to-one relationship A one-to-one relationship is created if both of the related fields are primary keys and / or have unique indexes. Sometimes Access recognises this automatically, as here, when a B2B customer table is being created to hold fields specific to B2B customers:


The result is:

Sometimes Access will not automatically recognise a one to one relationship and you may need to force a one-to-one relationship; you do this by setting the index property of the foreign key attribute to duplicates not allowed. So, if we have this B2B table which we want to link back to Customer:

Then we have to set the Indexed property of the field CustomerID:



Many-to-many relationship

A many-to-many relationship is really two one-to-many relationships with a third table whose primary key consists of two fields (see footnote 7) - the foreign keys from the two other tables. This has already been discussed in section 2.21.
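The link-table idea can be sketched outside Access. The example below uses Python's built-in sqlite3 module as a stand-in database; the table and column names (Orders, Product, OrderDetail, Quantity) and the sample rows are illustrative assumptions, not part of any Access sample database.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Orders  (OrderID   INTEGER PRIMARY KEY);
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
    -- Link table: its primary key is the pair of foreign keys.
    CREATE TABLE OrderDetail (
        OrderID   INTEGER NOT NULL REFERENCES Orders(OrderID),
        ProductID INTEGER NOT NULL REFERENCES Product(ProductID),
        Quantity  INTEGER,
        PRIMARY KEY (OrderID, ProductID)
    );
    INSERT INTO Orders VALUES (1), (2);
    INSERT INTO Product VALUES (10, 'Tea'), (11, 'Coffee');
    INSERT INTO OrderDetail VALUES (1, 10, 5), (1, 11, 2), (2, 10, 1);
""")

# The same (OrderID, ProductID) pair cannot be entered twice.
try:
    con.execute("INSERT INTO OrderDetail VALUES (1, 10, 9)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# One product can appear on many orders, and one order can hold many products.
orders_for_tea = [row[0] for row in con.execute(
    "SELECT OrderID FROM OrderDetail WHERE ProductID = 10 ORDER BY OrderID")]
print(duplicate_rejected, orders_for_tea)
```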

7.7.1. Relationships and linking: Enforcing referential integrity where appropriate

Should you enforce referential integrity? As you define a 1:M relationship, you are invited to check the box which enforces referential integrity. You should normally do this - it means that you cannot create a child record for a non-existent parent record, and this is a powerful and useful validation / data integrity constraint in nearly every case. Do NOT do this if your ER model shows the relationship to be optional.

To summarise why you might want to use referential integrity: if a table such as animal patient is related to another table animal type, and referential integrity is enforced, then a database management system will only allow an actual patient record to be inserted if the type of the animal already appears in the animal type table. This is a highly desirable constraint, since it ensures that questions to which a precise answer is needed actually get one. Without it, someone might erroneously update the database to say of Fido that he is a dawg, and a query which lists all dogs would omit Fido.

If you do set the option to enforce referential integrity, it is common also to set the option for Cascade update; it is potentially dangerous to set the option for Cascade delete, and you should only do this if you are certain of what you are doing.
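The Fido example can be tried out in a few lines. The sketch below again uses Python's sqlite3 module rather than Access, so it illustrates the constraint, not the Access dialog box; the table names AnimalType and AnimalPatient and the helper add_patient are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to do so.
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
    CREATE TABLE AnimalType (TypeName TEXT PRIMARY KEY);
    CREATE TABLE AnimalPatient (
        PatientName TEXT PRIMARY KEY,
        TypeName    TEXT NOT NULL REFERENCES AnimalType(TypeName)
    );
    INSERT INTO AnimalType VALUES ('dog'), ('cat');
    INSERT INTO AnimalPatient VALUES ('Rex', 'dog');
""")


def add_patient(name, type_name):
    """Try to insert a patient; return False if the database refuses it."""
    try:
        with con:
            con.execute("INSERT INTO AnimalPatient VALUES (?, ?)",
                        (name, type_name))
        return True
    except sqlite3.IntegrityError:
        return False


print(add_patient("Fido", "dawg"))  # False: 'dawg' is not in AnimalType
print(add_patient("Fido", "dog"))   # True

dogs = [row[0] for row in con.execute(
    "SELECT PatientName FROM AnimalPatient "
    "WHERE TypeName = 'dog' ORDER BY PatientName")]
print(dogs)  # ['Fido', 'Rex']
```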

7.8. System outputs

7.8.1. Queries

A query is a temporary results table resulting from joining together fields taken from one or more database tables. A query can also include calculated fields.

7.8.2. Reports

Reports are comprehensive summaries of a situation, and normally involve data from several tables. As such, a report is usually based on a query rather than on a single table. A report is frequently intended to be printed, rather than viewed on-screen.

7.8.3. Forms

Forms are used to get data into a system, and may also be used to get information out -- see the next section.

Footnote 7: Or, includes them, along with another attribute which ensures uniqueness, usually a date.

7.9. System inputs

7.9.1. Forms, sub-forms and their use with 1:M and M:N relationships

A form can be used to input data into a table. Where two tables are linked by a one-to-many relationship, it is common to use a form and an associated subform.

A subform is a form within a form. The primary form is called the main form, and the form within the form is called the subform. A form/subform combination is often referred to as a hierarchical form, a master/detail form, or a parent/child form.

Subforms are especially effective when you want to show data from tables or queries with a one-to-many relationship. For example, you could create a form with a subform to show data from a Customer table and a Cars table. One Customer can own many Cars. Conversely, one Car can be owned by only one Customer at a time. The data in the Customer table is the "one" side of the relationship. The data in the Cars table is the "many" side of the relationship. From a Customer-based form, a subform of type datasheet or continuous form can show the details of all the various cars owned by that customer.

Alternatively, the form/subform relationship can be used in the opposite sense, since with a Service record displayed, a user might want to show the details of the (one) car which is being serviced.

Subforms can be nested, that is, a 1:M:N situation such as Customer : Order : Order Detail can be implemented as a form containing a subform which itself contains a subform.

Unfortunately, there is no straightforward way to show the three tables which participate in a many-to-many relationship (for example, Order, Product and the link table OrderDetail). Often, it is adequate just to use two (different) form/subform combinations. Where it is necessary to show the contents of all three tables at the same time, a technique which is frequently applicable is to have a form and subform relationship, with the link between the subform table and its other owner being implemented as a combo box. An example of the use of this technique is provided in appendix 2.2.

7.9.2. Field-specific validation checks

In Microsoft Access, these may take the form of simplified Visual Basic rules, such as that a field must be either M or F. Where the list of permitted values will never change, such as in the case of gender, it is sensible to include the rule as a property of the attribute being defined. When the valid values form part of a variable list, it is probably better to set that list up as a separate table and to enforce referential integrity – see the next section.

7.9.3. Using referential integrity to carry out inter-table validation checks

Where two tables are linked in a one-to-many relationship, it is usually good practice to enforce referential integrity. See section 7.7.1. This makes it impossible to introduce a child record for a non-existent parent; this is often of considerable value in improving the design of a database.

A variant of this technique involves the specific identification of so-called lookup tables. A lookup table contains the valid values of an attribute. By making the lookup table a parent entity to the table whose values are to be verified, it becomes impossible to enter “bad” data, that is, data not authorised by the lookup table.

In this example, the grade attribute of a student’s result in a module has been made into a lookup based on the valid values stored in the parent Grade table. It is therefore impossible to record an invalid grade.


7.9.4. Table-level checks on forms

On a form, it is possible to cross check fields. For example, you might not allow the title Mr for a person whose gender is female. However, to do so requires the use of VBA.
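The logic of such a cross-field check is simple, whatever language it is written in. The sketch below shows it in Python purely as an illustration; in Access the equivalent test would be written in VBA and attached to the form. The function name title_matches_gender and the lists of titles are assumptions made for the example.

```python
def title_matches_gender(title: str, gender: str) -> bool:
    """Return False when a title contradicts the recorded gender.

    The title lists are illustrative assumptions, not a complete set;
    titles such as Dr, which fit either gender, are always accepted.
    """
    male_only = {"Mr"}
    female_only = {"Mrs", "Miss", "Ms"}
    if gender == "F" and title in male_only:
        return False
    if gender == "M" and title in female_only:
        return False
    return True


print(title_matches_gender("Mr", "F"))   # False: reject this record
print(title_matches_gender("Mrs", "F"))  # True
print(title_matches_gender("Dr", "M"))   # True
```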

7.10. Implementing processes

7.10.1. Data processing in Access

People expect systems to do things for them! This normally involves some amount of data processing or transformation.

7.10.2. Functional elements in Access

♦ Implicit data processing

E.g. a query involves joining two or more tables and generating a single unnormalised table which is the required result - Access is doing a lot of processing to achieve this, although this is not particularly visible to the system user.



♦ Calculations: using expressions in Microsoft Access

A query may involve arithmetic elements, such as creating totals and subtotals. When defining the query, instead of specifying a field, specify a calculated field. The syntax is exemplified in:

Amount : [Quantity] * [Unit price]

The operators (such as * for multiply) and functions (such as ) permitted in expressions are documented at http://office.microsoft.com/enus/access/HP051866381033.aspx which also describes the Expression Builder, which considerably eases the task of building valid expressions. Expressions are also used in forms and reports.
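For comparison, the same calculated field can be written as a column expression in SQL. The sketch below runs against Python's built-in sqlite3 module rather than Access, and the tables Product and OrderLine with their sample rows are invented for the illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY,
                          ProductName TEXT, UnitPrice REAL);
    CREATE TABLE OrderLine (OrderID INTEGER, ProductID INTEGER,
                            Quantity INTEGER);
    INSERT INTO Product VALUES (1, 'Tea', 2.5), (2, 'Coffee', 4.0);
    INSERT INTO OrderLine VALUES (100, 1, 4), (100, 2, 3);
""")

# The query joins the two tables and adds a calculated column, Amount,
# which is not stored anywhere: it is worked out each time the query runs.
rows = con.execute("""
    SELECT p.ProductName, o.Quantity * p.UnitPrice AS Amount
    FROM OrderLine AS o
    JOIN Product AS p ON p.ProductID = o.ProductID
    ORDER BY p.ProductName
""").fetchall()

print(rows)  # [('Coffee', 12.0), ('Tea', 10.0)]
```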



♦ SQL

The set-processing language in Access is SQL: Structured Query Language, which is an ANSI (American National Standards Institute) standard language. As the name suggests, SQL is a means of asking queries (questions) of a database and getting back answers. Part of the power of the relational data model is that, provided that the database consists of normalised entities, it is possible to ask almost any arbitrarily complex question and get an answer.

However, SQL is more than a conventional query language. It also provides set manipulation facilities, that is, it is possible to create whole new sets of data and to store them in tables, and / or it is possible to update complete sets of records in a single operation. Access implements this functionality as ‘append’ and ‘update’ queries – see below, section 7.11.1.



♦ Occasional need for record-at-a-time navigation and processing

Access usually manipulates records a set at a time. Sometimes, it is necessary to carry out record-at-a-time navigation and processing under program control. This is achieved in Access by means of:

• Recordsets: the Access mechanism for making tables and queries available a record at a time

• Visual Basic: the language in which you can manipulate individual records

ESC students should not normally try to learn how to do this.

7.11. System data transformations

Access provides the following explicit ways in which to take stored data and either to change the way in which it is stored, or to transform it under program control:

7.11.1. Append and Update queries

These are the user-accessible means by which Access provides set manipulation facilities, that is, it is possible to create whole new sets of data and to store them in tables, and / or it is possible to update complete sets of records in a single operation. They are not described further in this document.
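Although these queries are not covered further here, a short sketch may help to show what "a set at a time" means. It uses Python's sqlite3 module in place of Access, with invented Applicant and Student tables: one SQL statement appends every accepted applicant, and a second statement updates all of the new rows together.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Applicant (ApplicantNo INTEGER PRIMARY KEY,
                            Name TEXT, Accepted INTEGER);
    CREATE TABLE Student (StudentNo INTEGER PRIMARY KEY,
                          Name TEXT, Status TEXT);
    INSERT INTO Applicant VALUES (1, 'Ann', 1), (2, 'Bob', 0), (3, 'Cy', 1);
""")

with con:
    # "Append" query: copy a whole set of rows into another table.
    con.execute("""
        INSERT INTO Student (StudentNo, Name, Status)
        SELECT ApplicantNo, Name, 'Applicant'
        FROM Applicant
        WHERE Accepted = 1
    """)
    # "Update" query: change every matching row in a single operation.
    con.execute(
        "UPDATE Student SET Status = 'Enrolled' WHERE Status = 'Applicant'")

students = con.execute(
    "SELECT StudentNo, Name, Status FROM Student ORDER BY StudentNo"
).fetchall()
print(students)  # [(1, 'Ann', 'Enrolled'), (3, 'Cy', 'Enrolled')]
```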

7.11.2. Macros

Macros are stored sequences of user commands.

7.11.3. Visual Basic for Applications (VBA) modules inside Access

Program code can be linked to objects within a database, such as forms, etc. This program code is written in VBA.

7.11.4. Visual Basic programs outside Access

A program written in Visual Basic can manipulate data stored in a Microsoft Access database. The facility to do this is called ActiveX Data Objects (ADO).

This document does not cover these topics in any systematic manner. ESC students should not normally attempt to learn programming.


8. Ways in which to learn more MS Access

8.1. Sample databases and applications included with Microsoft Access

The material in this section is based on, and quotes from, the MS Access help files. Microsoft Access provides a sample database that you can use while you're learning Microsoft Access.

8.1.1. NorthWind Traders sample database (English edition) / Les Comptoirs (édition française)

Use this sample database when you're first learning Microsoft Access. The NorthWind database contains the sales data for a fictitious company called NorthWind Traders, which imports and exports speciality foods from around the world. By viewing the tables, queries, forms, reports, macros, and modules included in the NorthWind database, you can develop ideas for your own Microsoft Access database. You can also use the NorthWind data to experiment with Microsoft Access before you enter your own data. For example, you may want to practice designing queries using the Orders table since it contains enough records to produce meaningful results.

A French version of the same application can be built using the French-language edition of Access. It is entitled Les Comptoirs. It is not quite as complete as the American English version.

We have created an improved version of NorthWind which overcomes some of the weaknesses in its original design and implementation by Microsoft.

8.1.2. Database Wizards (Assistants)

Microsoft Access also includes a database wizard (assistant) that you can use to create common databases, such as a Contact Management database (see footnote 8). You can use the databases created by the database wizard as-is or as a learning tool to help you design your own databases.

Footnote 8: However, please note that this Contact Management system will NOT meet the requirement set out in section 4.3.1!

SECTION 3 – THE ANYTOWN DISTANCE LEARNING BUSINESS SCHOOL EXAMPLE

9. Example scenario: Anytown Distance Learning Business School

The Anytown Distance Learning Business School offers general business courses at undergraduate and postgraduate levels. The undergraduate course is a Bachelor of Arts (BA) course called Business Studies. The postgraduate course is a Master of Business Administration (MBA). Each course is administered by a Course Coordinator.

Students apply for a course, BA or MBA (see footnote 9). They send in an application form containing their personal details, and their desired course. On behalf of the School, the appropriate Course Coordinator checks whether the course is available and that the student has already obtained the necessary academic qualifications. If the course is available (not yet full) and the student is qualified, he or she is enrolled in the course, and the School confirms the enrolment by sending a confirmation letter to the student. If the course is unavailable or the student is not sufficiently qualified, the student is sent a rejection letter.

10. Background: Studying

The academic year is divided into two teaching semesters, the first of which runs from October to January, the second from February to May. At the start of a semester students must register for the modules they will be taking in that semester. These are the core modules for that semester of the course, and the electives chosen by the student. All modules last only one semester.

There is a third semester, over the summer, running from June to September. No ordinary modules are taught in the third semester. However, MBA students do their dissertation in the third semester.

Once the student is enrolled on a course, they have to study modules, each of which has an associated credit. The credit for an undergraduate module is 10; that for a postgraduate module is 15.

To be awarded a BA, a student on the BA course has to achieve 360 credits, and this is normally achieved by the student studying six 10-credit modules per semester for each of two semesters per year for each of 3 years: 6 × 10 × 2 × 3 = 360.

On the MBA, students study four 15-credit modules per semester for each of two semesters in a single year: 4 × 15 × 2 = 120. They then undertake a single dissertation (project) in the third semester; the dissertation is for 60 credits. A student is awarded an MBA when he or she has achieved 180 credits.

The table summarises the structure of each course:

Course | Modules per semester | Credits per module | Taught semesters per year | Number of years | Sub-total | Project | Total credits
BA | 6 | 10 | 2 | 3 | 360 | 0 | 360
MBA | 4 | 15 | 2 | 1 | 120 | 60 | 180
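The arithmetic behind the table can be checked with a few lines of code. This is a minimal sketch; the function name total_credits and its parameter names are chosen for the illustration only.

```python
def total_credits(modules_per_semester, credits_per_module,
                  semesters_per_year, years, project_credits=0):
    """Taught credits (the sub-total) plus any project credits."""
    taught = (modules_per_semester * credits_per_module
              * semesters_per_year * years)
    return taught + project_credits


ba = total_credits(6, 10, 2, 3)                       # 6 x 10 x 2 x 3
mba = total_credits(4, 15, 2, 1, project_credits=60)  # 4 x 15 x 2, plus 60
print(ba, mba)  # 360 180
```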

The student chooses the modules they will study at the start of each semester. A module is taught by a module leader. Each module is assessed by coursework and by an exam, in varying proportions – one module might be 60% exam, another 50%. A student passes a module if the mark for the module as a whole exceeds 40%. If they pass, they are awarded the module credits; if they fail, they are awarded zero credits. If a student fails a module, he or she has to take an alternative module in a later semester, so that they can obtain sufficient credits. Study is by means of Distance Learning. Coursework is submitted by email. The students do not need to come to Anytown, except when they have exams to do at the end of the first and second semesters.
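The pass rule can be sketched as follows. The calculation assumes that the coursework weighting is simply 100 minus the examination weighting, which the text implies but does not state outright; the function names are hypothetical.

```python
def module_mark(exam_mark, coursework_mark, exam_weight_pct):
    """Overall module mark as a weighted average of the two components.

    Assumes the coursework weighting is 100 minus the exam weighting.
    """
    coursework_weight_pct = 100 - exam_weight_pct
    return (exam_mark * exam_weight_pct
            + coursework_mark * coursework_weight_pct) / 100


def credits_awarded(mark, module_credits, pass_mark=40):
    """All or nothing: the text says the mark must exceed 40%,
    so a mark of exactly 40 is treated here as a fail."""
    return module_credits if mark > pass_mark else 0


mark = module_mark(exam_mark=50, coursework_mark=30, exam_weight_pct=60)
print(mark)                       # 42.0
print(credits_awarded(mark, 10))  # 10
print(credits_awarded(40.0, 10))  # 0
```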

11. A Closer Look into "Managing Students"

This section focuses on the part of the system that supports the administration of information about students on courses and modules.

Footnote 9: Note that course in this case study is neither programme nor module – but, as we will see, it is closer to programme than module.

• Students study modules drawn from two lists of modules held for the School, one of undergraduate modules, the other of postgraduate ones.
• Modules have titles and a unique identifying code. Each module has a pre-defined value expressed as a number of credits.
• Modules are of two kinds - some modules are core, some are electives, that is they are optional.
• Core modules must be taken by all students on the course. The course regulations will specify how many optional modules a student can take and what these options might be.
• Students who pass a module are awarded the number of credits specified as that module’s value. If they fail, they get zero credits.
• Students construct a programme of study by doing core modules, to which are added the optional (elective) modules they select from those available.
• Every course defines a maximum period of enrolment within which time the course must be completed. This is normally five years for an undergraduate course and three years for a postgraduate course. If a student does not complete within this time, the decision of the next exam board will be that they have failed the course.
• Students may suspend studies or withdraw from the course. The date on which this happens must be recorded.
• Each component (coursework or exam) of a module has a certain percentage weighting and a student’s overall mark for a module is calculated by combining the marks for each component.
• An exam board (jury) meets after each semester to consider the marks obtained by students and to determine whether they have passed or failed the modules they were registered for, and what their status on the course now is. This process is described in more detail in section 12.

12. The process of deciding what happens to students

At the end of every semester, the module leaders and course coordinator meet together in an exam board (jury). The exam board is chaired by the Dean of the School, who represents the School’s management.

The exam board firstly looks at the results for each module, checking to see that they are reasonable and sensible – that is, that the marks awarded are neither too high nor too low. If there is a problem, all the marks awarded are scaled up or down by a percentage decided by the board.

The exam board is then presented with a list of all the students on the course. For each student, there is a report, called a Student Results Summary, which shows what modules the student has undertaken and the results they have achieved. An example of this report is given below in section 19.

The jury then decides for each student which of the following applies:

1. They have passed all the necessary credits, including all the core modules for the course and at least the necessary number of options from the course’s collection of optional modules; in this case, the decision is that they have succeeded in the course and they are awarded a BA or MBA.
2. They have not yet passed all the necessary credits, but are making satisfactory progress: the decision is that they may proceed, taking further credits as necessary.
3. They are not making satisfactory progress, that is, they are failing to complete too many modules or have exceeded the maximum time they may stay on the course: the decision is that they have failed in the course as a whole.

After the exam board, a revised version of the Student Record is printed and sent to the students.
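The three possible decisions can be sketched as a small function. This is a deliberately simplified illustration, not the School's actual rule: it treats a student as out of time once the number of completed years reaches the course maximum, and it leaves out both the check on the number of optional modules and the board's judgement about "failing to complete too many modules", neither of which is defined precisely in the text. The function and parameter names are hypothetical.

```python
def board_decision(credits_achieved, credits_required, all_core_passed,
                   years_completed, max_years):
    """Return a student status following the three cases listed above.

    Simplifying assumptions: time runs out once years_completed reaches
    max_years; option counts and the "too many failed modules" judgement
    are not modelled.
    """
    if credits_achieved >= credits_required and all_core_passed:
        return "Passed"
    if years_completed >= max_years:
        return "Failed"
    return "Progressing"


# A BA needs 360 credits within a maximum of five years.
print(board_decision(360, 360, True, 3, 5))   # Passed
print(board_decision(120, 360, True, 1, 5))   # Progressing
print(board_decision(300, 360, False, 5, 5))  # Failed
```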

13. Course Review

Once a year, a Course Review meeting is held at which each course is reviewed to see whether the list of modules is still appropriate, or whether some modules have become obsolete, or whether new modules need to be devised. The same meeting has the power to change the relative weightings of the coursework and exam components, and to decide which modules are core on a course and which are electives. Module leaders can propose changes to module specifications. The Dean can propose changes to the programme itself.


14. Simplifying Assumptions

This database does not contain full details of qualifications (see footnote 10). So data about the following things is simply stored as large text fields (Access Memo type), because it does not need to be queried:

Course: Qualifications Required
Applicant / Student: Previous Qualifications

Each module has only one teacher, and that teacher is the module leader. One teacher may however be the module leader for a number of modules.

15. External entities

These are:

• Applicant
• Student
• Course Coordinator
• Module Leader
• Management – the Dean

16. Processes

This list is not necessarily complete.

16.1. Process Applicants

Decide whether to accept or reject students, on the basis of their qualifications and the availability of spare places on the course.

16.2. Admit students to Course – Course Enrolment

Details of the applicant are transferred to the Student table.

16.3. Register students on core and optional modules

This is done at the start of each semester. Students fill in a form stating which optional modules they wish to do and confirming the core modules that they have to do.

16.4. Teach and assess a module

Each module lasts one semester.

16.5. Prepare for and hold exam board (jury)

• Collect together the results for all students for all modules they have been studying
• Review module results in exam board
• Decide student status in exam board

16.6. Review Course

This is described in section 13.

Footnote 10: For a more thoughtful approach to how to manage qualifications, please see section 2.22.

17. Documents

The Course Coordinators currently produce and maintain the following documents:

17.1. Course Description

Produced for each course, this describes the course, says what qualifications students have to have, and lists the core and elective modules. It is updated for and as a result of the Course Review meeting (section 13). The Course Description is a report; it is not an entity.

17.1.1. List of Modules

For each module, the following data has to be kept:

• Module Code
• Module Title
• Course Code – The Course on which the module is used – a module is used either on the BA or the MBA
• The Lecturer who is the Module Leader
• Elective or Core?
• Examination weighting %

17.2. Management Reports

The following statistics and analyses would be of value to management:

• Analysis of the results for each module – average mark, standard deviation, percentage of students who do not pass the module.

18. Entity and Attribute Lists

Some initial analysis work has been undertaken; however, please note that it is not complete, and you are expected to add to it. Some of the entities are Applicant, Course, Lecturer (module leader), and Student. Some of the attributes for some of these entities are:

Applicant
• Applicant No
• Date of Application
• Applicant Name
• Applicant Address
• Applicant Country
• Actual Qualifications (stored as a Memo field)

Course
• Course Code
• Course Name
• Course Description
• Qualifications Required (stored as a Memo field)

Lecturer
• Lecturer No
• Lecturer Name
• Home Address

Student
• Student No
• Course Code (Foreign key)
• Student Name
• Student Address
• Date of Birth
• Previous Qualifications (stored as a Memo field)
• Status (Applicant / Enrolled / Passed / Failed / Withdrawn / Suspended / Progressing)

19. Example Student Record Report

This is an extract of the information presented to the exam board for one student; it is later printed and sent to the student.


20. Anytown high-level Use Case diagram

Please note that the label can also be written « include ». Note also that Microsoft Visio employs or « uses » instead of > - they mean the same thing.

[Figure: Use Case diagram for Anytown. Actors: Applicant, Course coordinator, Student, Jury, Module leader, Dean. Use cases: Input applicant details; Confirm applicant as student; Record student module choices; Print module results for jury; Print student results for jury; Update student status; Print student results letters; Record new staff member; Submit coursework; Change course structure; Input module results; Update module; Allocate teacher; Request management reports. Two «uses» links are shown.]

21. Anytown: Context diagram

[Figure: context diagram for the University of Anytown Student System. External entities: Applicant, Module leader, Programme co-ordinator, Dean / Management, Student. Data flows: Application; Acceptance or rejection; Decision; Proposed changes to module specification; Revised module specification; Module results; Course description; Management reports; Potential changes to programme; Module choices; Coursework and exams for assessment; Student results letters.]

22. Level 1 DFD

[Figure: Level 1 data flow diagram. External entities: Applicant, Module leader, Course co-ordinator, Dean / Management, Student. Processes: 1 Process applications; 2 Admit students to course; 3 Register students on core and elective modules; 4 Prepare for and hold exam board; 5 Teach and assess module; 6 Review Courses and modules; 7 Produce management reports. Data stores: D1 Applicant details; D2 Students; D3 Module registrations; D4 Module results; D5 Module & Course specs. Data flows: Application; Acceptance or rejection; Decision; Module marking; Module results; Student status; Proposed changes to module specification; Revised module specification; Potential changes to Course; Course descriptions; Management reports; Module choices; Coursework and exams for assessment; Semester results letters; Module results letters.]

23. Example Level 2 DFD

[Figure: Level 2 data flow diagram for process 4, Prepare for and hold exam board. Sub-processes: 4.1 Collect module results and produce student profile; 4.2 Review module results; 4.3 Decide student status; 4.4 Print results letters. Data stores: D2 Students; D4 Module results; D6 Student profiles. External entity: Students. Data flows: Initial module results; Changed module results; Revised module results; Student result letters.]

24. Data dictionary

We now need to move towards a good ERA model by means of top-down entity attribute modelling. The approach I have adopted here is to work on the basis of the list of "obvious" entities which I identified in section 18, put them into a spreadsheet, and gradually add the appropriate attributes. The spreadsheet is an extended example of what is sometimes called a Data Dictionary.

24.1. Data dictionary for Anytown Business School

External entities

• Applicant
• Student
• Course Coordinator
• Module Leader
• Dean

Processes

Process name | Process (P) or Sub process (S)? | Process No | Subprocess No | DFD number
Process applications | P | 1 | | 1
Admit students to course | P | 2 | | 2
Register students on core and elective modules | P | 3 | | 3
Prepare for and hold exam board | P | 4 | | 4
Collect module results and produce student profile | S | 4 | 1 | 4.1
Review module results | S | 4 | 2 | 4.2
Decide student status | S | 4 | 3 | 4.3
Print results letters | S | 4 | 4 | 4.4
Teach and assess module | P | 5 | | 5
Review programmes and modules | P | 6 | | 6
Produce management reports | P | 7 | | 7

Description (exam board sub-processes): The scaling factor (if any) is applied to the recorded student results before the Student Results Summary is reprinted.

Data Stores

Store No | Store name
1 | Applicants
2 | Students
3 | Module registrations
4 | Module results
5 | Module specifications
6 | Student profiles
7 | Course specifications

Data Flows

External entity | Process No | Direction | Name of flow | Process name
Applicant | 1 | Inward | Application | Process applications
Applicant | 1 | Outward | Acceptance or rejection | Process applications
Course Coordinator | 1 | Outward | Application | Process applications
Course Coordinator | 1 | Inward | Decision | Process applications
Course Coordinator | 6 | Inward | Course description | Review programmes and modules
Student | 3 | Inward | Module choices | Register students on core and elective modules
Student | 4.2 | Inward | Coursework and exams for assessment | Review module results
Student | 5 | Outward | Module results letters | Teach and assess module
Student | 4.4 | Outward | Student results letters | Print results letters
Module Leader | 6 | Inward | Proposed changes to module specification | Review programmes and modules
Module Leader | 6 | Outward | Module specification as revised and agreed | Review programmes and modules
Module Leader | 6 | Inward | Module results | Review programmes and modules
Dean | 7 | Outward | Management reports | Produce management reports
Dean | 6 | Inward | Proposed changes to programme | Review programmes and modules


Analysis of the results of each module; course description; list of modules

Entities

Attribute

Primary? Y or C

Foreign?

Domain Type Size

Validation Format Input mask Rules

Applicant / Student Student number Student forenames Student last name

Y

Text

11 >

Text

30

Text

20 >

Date of birth

Date/time

Student address line 1

Text

20

Student address line 2

Text

20

Text

50 >

Student city Student postcode Student country Course code Application date Enrolment date

Y

Date

Text

8

Text

32

Text

3

LLL00000000

00/00/0000

Date/time

20 Date

00/00/0000

Date/time

20 Date

00/00/0000


Description An applicant becomes a student when they are enrolled

Finishing date

Date/time

20 Date

Status

Text

12

Term address line 1

Text

20

Text Text

20 50 >

Term address line 2 Term city Term postcode Contact details Previous qualifications

Text

8

Text

60

00/00/0000 Applicant / Enrolled / Passed / Failed / Withdrawn / Suspended / Progressing

Defaults to Anytown

Memo

Employee Employee number Employee forenames Employee last name

Y

Text

7 >

Text

30

Text

20


LLL0000

e.g. EMP1234

Employee role

Text

20

Social security number

Text

16

Employee address line 1

Text

20

Text

20

Text

50

Text

8

Text

32

Employee address line 2 Employee city Employee postcode Employee country

Programme

Employee contact details Level Credits per module Project credits Credits required

Y

Memo Text

E.g. telephone numbers, etc. 1 >

Integer

Text

P/U

60 if postgrad 360 if undergrad; 180 if postgrad

Integer

Y

L

10 if undergrad; 15 if postgrad

Integer

Course Course code

Course coordinator / module leader / Dean

MBA / 3 BA

Course name Course coordinator Level

Y Y

Text Long Integer Text

Required qualifications

Memo

Max number of students

Integer

Normal number of years

Integer

Max number of years Module value

40

1

Integer Integer

Modules per semester

Integer

Taught semesters per year Description

Integer Memo

Module Module code Module title

Y

Text

4

Text

1

BA Business Studies or Master of Business Administration Employee who manages the course

Course code

Y

Specification

Text Hypertext link

1

A Module Operation is the operation, or running, of a module in a given year

Module operation Module code Year

C C

Y

Core / elective / obsolete? Examination weighting Teacher Scaling factor applied Registration Result Module code Student number

Text Text

4 4

Text

C/E/ 1 O

L

Integer Text

% 7 >

LLL0000

Single

C

Y

Text

C

Y

Text

Scaling factor, in %, decided by jury

%

4

L000

11 >

Date course work received Course work mark

Date/time Integer

%

Exam mark

Integer

%

LLL00000000

Overall mark | Integer | %
Module result | Text | 4 | Pass / fail

Relationships

Parent | Relationship | Child | Degree | Description
Programme | Includes | Course | 1:M | A Course is part of either a Postgraduate or an Undergraduate Programme
Course | Enrols | Applicant / Student | 1:M | An application is made by an Applicant for a Course. If they are acceptable, they may Enrol on the Course. They are then a Student on that Course
Course | Consists Of | Module | 1:M | A Course is delivered as a series of Modules
Employee | Coordinates | Course | 1:M | An Employee whose role is Course Co-ordinator coordinates the Course
Employee | Leads | Module | 1:M | An Employee whose role is Lecturer Leads the Module
Module | Results In | Registration Result | 1:M | Registration Result resolves the many-to-many relationship between Module and Student
Applicant / Student | Is Registered On | Registration Result | 1:M | Registration Result resolves the many-to-many relationship between Module and Student
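The last two relationships say that Registration Result resolves the many-to-many relationship between Module and Student. The sketch below shows one way that idea might look as tables. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Access, the table and column names are simplified from the data dictionary, and the function names and sample values in the usage note are hypothetical.

```python
import sqlite3


def build_registration_db() -> sqlite3.Connection:
    """Create Module, Student and the Registration Result link table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE module (
            module_code  TEXT PRIMARY KEY,
            module_title TEXT
        );
        CREATE TABLE student (
            student_number TEXT PRIMARY KEY,
            last_name      TEXT
        );
        -- One row per (module, student) pair: two 1:M relationships
        -- stand in for the single many-to-many relationship.
        CREATE TABLE registration_result (
            module_code    TEXT NOT NULL REFERENCES module (module_code),
            student_number TEXT NOT NULL REFERENCES student (student_number),
            overall_mark   INTEGER,
            PRIMARY KEY (module_code, student_number)
        );
        """
    )
    return conn


def modules_for(conn: sqlite3.Connection, student_number: str) -> list:
    """Return the codes of the modules a student is registered on."""
    rows = conn.execute(
        "SELECT module_code FROM registration_result "
        "WHERE student_number = ? ORDER BY module_code",
        (student_number,),
    )
    return [code for (code,) in rows]
```

The composite primary key mirrors the two "C" (composite key) attributes of Registration Result in the data dictionary: a student can appear on many modules and a module can have many students, but each pairing can be recorded only once.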

Module

Use Cases Input applicant details Confirm applicant as student Register student on modules Input module results Print module results for jury Print student results for jury Print student results letters Change course structure Change course structure Update module Update module Update student status Record new staff member Submit coursework Request management reports Allocate teacher

Runs as

Module operation

Actors Course coordinator Course coordinator Student Module leader Course coordinator Course coordinator Course coordinator Dean

Module leader Dean Jury Dean Student Dean Dean

1:M Relationships Used by actor Used by actor Used by actor Used by actor Used by actor Used by actor Used by actor Used by actor Includes Used by actor Includes Used by actor Used by actor Used by actor Used by actor Used by actor

Other Use Cases

Description

Update module Module, and Module Operation Allocate teacher

System Outputs

Reports | Description
Analysis of the results of each module | Average mark, standard deviation, percentage of students who have not passed
Student Results Summary | See section 10 of scenario

Forms

SubForms

Queries

Description Description

System Inputs Forms Applicant details Student details Record student module choices Update course structure Update module Update student Update member of staff Coursework receipt

SubForms

Description

Programme and module details Module and Module Operation details Registration results

Validation Checks (NB: this is intended for inter-entity validation checks, and there are none in this particular scenario)


Description

25. Anytown ER diagram

[Figure: Entity Relationship model for Anytown. Entities: Programme, Course, Employee, Module, Module Operation, Registration / Result, Applicant / Student. Relationships: Includes, Coordinates, Consists of, Leads, Runs as, Enrols, Results in, Is Registered on.]


26. Anytown system implementation

In order to use the analysis and design work we have already undertaken, you would begin by translating the ERA model (data model) into equivalent Access objects. Entities become tables, attributes become fields, and relationships are defined as relationships! Similarly, the Use Case diagram has already been used to identify the inputs and outputs listed in the dictionary above. Implementation in Access involves converting these into equivalent forms and subforms. You might like to try this for yourself (because we have not uploaded an Anytown database). Over to you to try…
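To make the translation concrete, here is a minimal sketch of two Anytown entities turned into tables. It is not the Access implementation itself: it assumes Python's built-in sqlite3 module as a stand-in, keeps only a few attributes, and approximates the LLL00000000 input mask for the student number with a CHECK constraint. The function name and the sample values in the usage note are hypothetical.

```python
import sqlite3

# Entities become tables, attributes become fields, and the 1:M
# relationship "Course Enrols Applicant / Student" becomes a foreign key.
SCHEMA = """
CREATE TABLE course (
    course_code TEXT PRIMARY KEY,
    course_name TEXT NOT NULL
);
CREATE TABLE student (
    -- Approximates the input mask LLL00000000: three capital letters,
    -- then eight digits.
    student_number TEXT PRIMARY KEY
        CHECK (student_number GLOB
               '[A-Z][A-Z][A-Z][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'),
    last_name   TEXT NOT NULL,
    course_code TEXT NOT NULL REFERENCES course (course_code)
);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database holding the two example tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
    conn.executescript(SCHEMA)
    return conn
```

With this in place, inserting a student whose course code does not exist in the course table, or whose student number does not fit the pattern, should be rejected by the database. That is the same protection Access gives when referential integrity is enforced on a relationship and an input mask is set on a field.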

27. Terminology associated with data modelling and database design

Unfortunately, for historical and other reasons, a variety of complex vocabulary has built up in the area of database design, and specifically of normalisation. The literature of normalisation generally follows the vocabulary established by Edgar Codd. This is based on a specialised branch of mathematics, and is frankly obscure. I have therefore deliberately changed the vocabulary in the rest of this document to follow the Entity Relationship Attribute approach we have been using so far. It is possible that you will encounter other vocabularies, and therefore I have set out the equivalences in the table below. The only columns you need to know are those headed ERA and Access. OOAD stands for object oriented analysis and design, a more recent approach which we do not teach in the School.

Meaning | File | Spreadsheet | ERA | Access | Codd | OOAD
Class of object, thing | File | (worksheet) | Entity | Table | Relation | Class
Instance of object, thing | Record | Row | Entity occurrence | Record | Tuple | Instance
Property, fact about | Field | Column | Attribute | Field | Attribute | Attribute
Relationship | (none) | [achieved painfully by lookup functions] | Relationship | Relationship | Relationship | Association
Set of all possible values | (none) | (none) | (none) | (none) | Domain | Domain / type
Operation | (none) | Formula | (none) | Module | (none) | Operation / method
Degree (1:M etc.) | (none) | (none) | Degree | Degree | Cardinality | Multiplicity


28. References

28.1. Basics of structured analysis

There are myriad books on basic "structured" systems analysis and design techniques. One which I would recommend for its combination of cheapness and accessibility is:

Hughes, Martin. Mastering systems analysis and design. Basingstoke: Macmillan, 2000. ISBN 0-333-74803-4. Out of print.

The following book is excellent for setting the techniques used in this document in a business-oriented context:

Curtis, Graham & David Cobham. Business Information Systems: Analysis, Design & Practice, 6ed. Financial Times / Prentice Hall, 2008.

28.2. Database theory

The classic reference for students who really want to understand database theory is the book by Chris Date. Date was a collaborator with Dr. Edgar Codd, who invented the relational data model, until the death of the latter in 2003:

Date, Chris. An introduction to database systems (8th edition). Reading, MA: Addison-Wesley, 2003.

This book is frankly difficult at first encounter, but it remains the classic reference on relational databases.

28.3. Data Flow Diagrams (DFDs)

For a tutorial on DFDs, see http://www.cems.uwe.ac.uk/~tdrewry/dfds.htm

28.4. Entity relationship modelling

Entity relationship modelling was originally proposed by Peter Chen in the seminal article:

Chen, Peter. The entity-relationship model—toward a unified view of data. ACM Transactions on Database Systems (TODS), Volume 1, Issue 1 (March 1976). Special issue: papers from the international conference on very large data bases, September 22-24, 1975, Framingham, MA. ISSN 0362-5915.

I use a simplified version of his notation in this document. For an additional online tutorial about entity relationship modelling, see http://www.cems.uwe.ac.uk/~tdrewry/lds.htm

28.5. Use Case

Wikipedia (accessed 26/02/2008) has a useful summary of Use Case diagrams: http://en.wikipedia.org/wiki/Use_case_diagram

See also http://www.systemsanalysis.org.uk/ accessed 15/06/2008.

28.6. Basics of Object Oriented Analysis and Design (OOAD)

The structured approach is not the only one used in industry. The more recent object oriented analysis and design approach is conceptually more difficult than the older structured approach. If you need a textbook on it, I recommend the following, although it is NOT suitable for the basic "structured" systems analysis and design techniques used in this document:

Bennett, Simon & McRobb, Steve & Farmer, Ray. Object-oriented systems analysis and design using UML, 3/e. Maidenhead: McGraw-Hill, 2006. £44.99.

A good textbook for initial study of object-oriented analysis and design. Rather long-winded in places.


29. Appendix 1 Business Process Analysis using Use Case Analysis

With thanks to Dr. Ken Lunn, former colleague at the University of Huddersfield, whose material has formed the main basis for this section.

A Use Case is a definition of a meaningful interaction with a computer system. If you have used the internet to buy things, an example of a Use Case would be choosing something from an online catalogue, and another might be paying for the goods.

Use Case modelling is part of requirements definition and systems analysis. At the high level, a set of Use Case diagrams defines the presentation of the system, and these are excellent tools for discussion with stakeholders of a system, such as users and sponsors. At a more detailed level, Use Cases are used to fully specify the external functionality of a system. Use Cases are part of the information required by developers to design and implement a system.

Use Case diagrams say "what" a system does. The detailed analysis of Use Cases begins to say something of "how" the system behaves in an environment. However, it does not say "how" a system is structured internally to provide that behaviour. In computer system development you will frequently see this separation emphasised. Before you decide how a system works, you need to determine what it does first - a simple and obvious rule, but one so often forgotten to many people's ultimate regret. That’s why Use Case diagrams (UCDs) and Use Case models (UCDs with supporting text documents) can be so useful.

29.1. What is a Use Case Diagram?

A Use Case Diagram models a complete business process. It consists of three key elements:

• ACTORS: People or things that use a system. An Actor might be a clerk in a business, a manager, or even a customer accessing a system via the Internet. Other computer systems are also called Actors. A banking system that you send information to might well be an Actor in your system.

• USE CASES: A Use Case is a meaningful piece of functionality provided by a computer system. It can be quite complicated. Examples would be printing invoices, accepting payment, ordering goods. A use case in a use case diagram is a visual representation of distinct, identifiable (nameable) business functionality in a system. To choose a business process as a likely candidate for modelling as a use case, you need to ensure that the business process is discrete in nature. Discrete means separately and clearly identifiable. As the first step in identifying use cases, you should list the discrete business functions in your problem statement. Each of these business functions can be classified as a potential use case.

• RELATIONSHIPS: These are links between Actors and Use Cases. Actors use Use Cases, and also Use Cases can use other Use Cases.

We draw a Use Case as an ellipse with the name of the Use Case underneath:

[Figure: an ellipse with the name "A Use Case" underneath]

Sometimes the name is put inside:

[Figure: an ellipse with the name "A Use Case" inside]

The Use Case name is a concise, active description of the behaviour carried out by the Use Case, such as "print invoice". Do not write mini-essays to describe the behaviour of the Use Case - we shall use a more elaborate means for describing the behaviour in full.

An Actor is drawn as a stick person:

[Figure: a stick person labelled "An Actor"]

This is rather an unusual choice of notation when the Actor is an external computer system, but you will get used to it. An Actor is really a role, not a person. One person may use the system under many different roles. When finding actors, you are looking for the roles that people adopt, not the people or even the job titles.

Relationships are drawn as lines, usually with an arrow:

[Figure: "An Actor" joined by an arrow to "A Use Case"]

This means that an Actor uses the Use Case. In any relationship there will be two-way communication. The direction of the arrow indicates who initiates the interaction. Often in an interactive system, it is the Actor that initiates the dialogue, but it can be the Use Case. Sometimes the arrow is left out.

A Use Case can use another Use Case. If you have a piece of well-defined functionality, it makes sense to re-use this wherever possible. Also, sometimes a Use Case gets too big to manage sensibly and it makes sense to break this down into smaller Use Cases. There are two ways Use Cases can relate. The first is where a Use Case "includes" another Use Case. In this case the second Use Case is always invoked as part of the execution of the first. This is drawn with an arrow pointing to the Use Case that is included, with the label <<include>> tagged to the line:

[Figure: the Use Cases "Add item to sales ledger" and "Change Invoice", joined by an arrow labelled <<include>>]

Please note that the label <<include>> can also be written « include ». Note also that Microsoft Visio employs <<uses>> or « uses » instead of <<include>>.

Sometimes a Use Case is only called occasionally from another Use Case. From the scenario analysis of the business, this will often be to support an alternative path or an exception. We draw this with an arrow pointing the other way (yes, it is confusing at first), where the arrow points to the calling Use Case. So below, Chase Payment sometimes calls Issue Warning Letter.

[Figure: the Use Cases "Chase Payment" and "Issue Warning Letter", with the arrow pointing to Chase Payment]

So now we have the building blocks for a sophisticated description of a system's external behaviour. Let us look now at an application that manages payments for customers. A credit controller might be able to print invoices, chase payments, process payments, correct invoices, correct deliveries and register bad debts using a computer system. Part of chasing payment may involve either issuing warning letters, where the computer system prints one off, or telephoning the customer, where the computer system provides a means for the controller to log the results of the conversation.
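The building blocks described so far (Actors, Use Cases, and the relationships between them) can also be written down as plain data, which some readers may find a helpful way to check a diagram before drawing it. The following is a minimal sketch in Python, not part of any Use Case tool. The class name, the field names and the word "extends" for the occasional-call relationship are my own assumptions; only the actor and Use Case names come from the credit control example.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class UseCaseModel:
    """Actors, Use Cases and the relationships between them, held as pairs."""

    # (actor, use case): the actor uses the use case
    uses: Set[Tuple[str, str]] = field(default_factory=set)
    # (calling use case, use case that is always invoked)
    includes: Set[Tuple[str, str]] = field(default_factory=set)
    # (use case that is only sometimes invoked, calling use case)
    extends: Set[Tuple[str, str]] = field(default_factory=set)

    def reachable(self, actor: str) -> Set[str]:
        """Use Cases an actor may trigger, directly or via other Use Cases."""
        pending: List[str] = [uc for a, uc in self.uses if a == actor]
        found: Set[str] = set()
        while pending:
            current = pending.pop()
            if current in found:
                continue
            found.add(current)
            pending += [called for caller, called in self.includes if caller == current]
            pending += [called for called, caller in self.extends if caller == current]
        return found


# A fragment of the credit control example described in the text.
credit_control = UseCaseModel(
    uses={
        ("Credit Control Clerk", "Chase Payment"),
        ("Credit Control Clerk", "Print Invoice"),
    },
    extends={
        ("Issue Warning Letter", "Chase Payment"),
        ("Telephone Reminder", "Chase Payment"),
    },
)
```

Calling credit_control.reachable("Credit Control Clerk") collects Chase Payment and Print Invoice together with the two Use Cases that Chase Payment sometimes calls. Keeping each relationship as a simple pair mirrors the diagram: one pair per line or arrow.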

[Figure: Use Case Diagram for the credit control example. Actors: Credit Control Clerk, Customer. Use Cases: Print Invoice, Issue Warning Letter, Chase Payment, Telephone Reminder, Process Payment, Correct Invoice, Receive Payment, Correct Delivery, Receive Cheque, Register Bad Debt, Receive Bank Credit.]

With a Use Case Diagram like the one above, you are getting a clear picture of who uses a system, and what they can do with it. You also have forced some decisions, and provided some external structure to the system.

29.2. What to do if a use case diagram won’t fit on a single page?

Answer: split the diagram over several pages, with each page corresponding to a single area of business activity.

29.3. Finding Use Cases

The first stage of analysis is to map out the business using:

• A High Level Business Activity Model, that breaks a business down into a simple, three level hierarchy of activities.
• Scenario Analysis of Business Activities, using primary and alternative path analysis.
• Construction of Business Processes or Business Workflows from the Scenario Analysis, described using Activity Diagrams.

The use of these techniques is not taught on this module, nor is it described in this document. That is because the scope of activity for the kind of systems described in the rest of the document is assumed to be relatively small scale, in relatively clear-cut situations. So you should already know what the business process is, and you are simply trying to improve it.

Once you have a business process fully defined, you go around all the activities asking the simple question "is there a potential use of a computer system here?" Sometimes an activity in a business process needs no system support, sometimes it needs one use case, and sometimes it needs many use cases.

We are now beginning to see the rudiments of a methodology emerging. It starts in the business arena, describing the business in some detail. Then it starts to think about where computer systems are used. The first thing to worry about is what the system does, and how it fits in to the business, not how it does it in detailed technical terms.

29.4. Naming Use Cases

This section summarises the document “What Makes a Good Use-Case Name?” by Dr. Use Case (aka Leslee Probasco, Rational Software Canada), to be found at http://download.boulder.ibm.com/ibmdl/pub/software/dw/rationaledge/mar01/WhatMakesaGoodUseCaseNameMar01.pdf accessed 15/12/2008.

“Auction” is a poor choice of use case name. Is it a noun or is it a verb? Both. Having clear and meaningful use-case names is very important; it's worth spending the time up front to get them right.

Why Should We Care About Use-Case Names, Anyway?

Why should we care so much about what name we give a use case? When defining the requirements that a part of the business needs, the project team and customers must agree on scope definition and cost and schedule estimates. Ultimately they must make the decision to either proceed with the project or to cancel it (one of the objectives of the inception phase during which UCDs are initially created). Often the only information available about the identified actors and use cases is their names. Along with specified features and other system requirements, this must be sufficient for all stakeholders to have a clear enough understanding of the functionality of the system in order to make this critical "thumbs-up" or "thumbs-down" decision.

Naming use cases should enable anyone (at least anyone familiar with the problem domain) to look at a use-case diagram, noting the actor and use case names and their associations, and have a pretty good idea of the value or goal to be achieved by each use case. To accomplish this, it is very important to choose the names of all actors and use cases with this objective in mind.

The "Golden Rule of Use-Case Names" suggested by Probasco comes from the Rational Unified Process (RUP). This states that "each use case should have a name that indicates what value (or goal) is achieved by the actor's interaction with the system" (if all goes as expected! :>).

Here are some good questions to help you adhere to this rule:

• Why would the actor initiate this interaction with the system?
• What goal does the actor have in mind when undertaking these actions?
• What value is achieved and for which actor?

The preference is to have a use-case name that begins with a verb, just as in a To-Do list. For an ATM system, a to-do list might include actions such as "Withdraw Cash", "Transfer Funds", "Service the ATM" and "Deposit Funds". Names like "Cash Withdrawal", "Funds Transfer", "ATM Service" and the like do not follow this rule.

So: All use-case names should indicate what value or goal is achieved by the actor(s)' interaction with the system and must be stated in the active form, beginning with a verb.
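The verb-first rule is simple enough to check roughly by machine. The sketch below is only a heuristic under stated assumptions: it looks up the first word of each name in a small, hypothetical list of verbs (KNOWN_VERBS, which you would extend for your own problem domain), so it can flag candidates for review but cannot judge whether a name really expresses the actor's goal.

```python
# Hypothetical starter list of verbs; extend it for your own problem domain.
KNOWN_VERBS = {"withdraw", "transfer", "service", "deposit", "print", "chase", "register"}


def starts_with_verb(name: str) -> bool:
    """True if the first word of the use-case name is a known verb."""
    words = name.split()
    return bool(words) and words[0].lower() in KNOWN_VERBS


def names_to_review(names):
    """Return the use-case names that do not start with a known verb."""
    return [name for name in names if not starts_with_verb(name)]
```

Applied to the ATM names above, names_to_review keeps "Withdraw Cash", "Transfer Funds" and "Service the ATM" out of the review list and returns "Cash Withdrawal", "Funds Transfer" and "ATM Service". A name such as "Auction" would also be returned, simply because it is not in the list; deciding whether it is being used as a noun or a verb still needs a person.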

29.5. Describing Use Cases

The high level Use Case Diagrams above are fine for a “mile-high”, or “birds-eye”, view of the computer system’s behaviour. For many stakeholders, such as sponsors and managers, this will be enough. However, the analyst who thinks the job is done is in for a rude awakening. Once you have defined the use cases at the high level, a lot of work may still be necessary to open these up and define them in detail. At the very least, the Use Case may need to be described in some detail in a text document which explains the use case diagram. Sometimes people distinguish between the diagram alone, the UCD, and the use case model, which is the UCD and other supporting documentation.

Now that we know what the system presents to the various users (or actors), we need to define in fine detail the "how" of that interaction. Use cases can be further defined as detailed sequences of behaviour. This is beyond the scope of this simple introduction.

29.6. Using Use Cases to identify System Inputs and Outputs

At every point on a Use Case diagram where an arrow connects a (human) actor to a Use Case, one or more Forms, Reports or Web Pages are needed to input data to the process or to output information. You can therefore make a list of these interactions, and state what means (Form, Report or Web Page) is appropriate to each interaction.

29.7. Other resources for learning about Use Cases

There’s plenty of material about use case modelling on the Web, but much of it is unnecessarily complicated because it is described as part of the full UML language, which business students should not learn. One reference is http://www.parlezuml.com/tutorials/usecases/usecases_intro.pdf accessed 24/11/2008 and written by Jason Gorman.


30. Appendix 2 Data Flow Diagrams

This appendix summarises the steps required in creating a data flow diagram. Examples of context, level 1 and a single level 2 diagram are given earlier in the document, in the University of Anytown business school case.

Creating a DFD is a process also called Data Flow Modelling: the process of identifying, modelling and documenting how data moves around an information system. Data Flow Modelling examines processes (activities that transform data from one form to another), data stores (the holding areas for data), external entities (what sends data into a system or receives data from a system), and data flows (routes by which data can flow).

30.1. What are Data Flow Diagrams (DFDs)?

♦ DFDs are diagrams that show how data flows within a system
♦ They are initially created as two simple HIGH LEVEL diagrams: the Context Diagram and the Level One Data Flow Diagram (DFD)
♦ Parts of the high level diagram may then be ‘exploded’ (alternatively, expanded or zoomed) to show more detail
♦ DFDs represent the flow of data between different processes within a system, together with the original flows of data into the system and the output flows of information from the system to its users, clients or other consumers
♦ The idea is that the diagrams be simple and intuitive, not focusing on details
♦ They should describe what the users of systems do, rather than what computers do
♦ Limitations:
   ∗ Focus only on flows of information (not physical flows of goods)
   ∗ The diagrams therefore ignore flows of materials
   ∗ Nor do they show how a process works: its decision points (if this, then this, otherwise this) or repetitions. Such diagrams, sometimes called flowcharts, are NOT a part of the approach documented in this booklet

30.2. Why use Data Flow Diagrams?

♦ They are a technique used in structured analysis, that is, a tool used by the systems analyst for analysing requirements
♦ They are also a communications aid
   ∗ Between user and analyst
   ∗ Within an analysis team
♦ The diagrams can be used to resolve disagreements about how work is being done or how it should be done in the future

30.3. What is a DFD? Main elements

♦ Context diagram
♦ Sources and Recipients of information (“external entities”)
♦ Main system inputs and outputs
♦ Main data stores

30.4. The components of a DFD

A complete DFD is a hierarchical set of diagrams:

♦ One Context diagram
   ∗ Shows the system boundary by indicating the name of the main process in the system, and the External Entities which lie outside the system
   ∗ Summarizes the main data flows into and out of the system
   ∗ The Context diagram defines the system boundary: what is a part of the system under study (and what is outside it); this is also sometimes called the scope of the system
   ∗ Defines data flows to, and information flows from, a system
   ∗ Identifies the data flows and stores within the system
   ∗ Identifies the functions (processes) performed by the system
♦ Levelled set of DFDs
   ∗ One Level 1 DFD, which identifies the business processes and breaks them down into subprocesses
   ∗ Several (2 to 10, typically 7) Level 2 DFDs
♦ Main system functions (processes), the key functions, appear on the Level 1 DFD
♦ Where necessary, explode the Level 1 DFD into component Level 2 DFDs
   ∗ This means that there are parent and child processes
   ∗ Consider parent as a window onto its children
   ∗ Possible to look at a process at any level of detail, so may need Level 2 diagrams, or even Level 3
♦ All key documents should be identified as data flows

[Figure: Structure of a Data Flow Diagram, showing that it is a hierarchical set of diagrams. NB: this is NOT a DFD; this diagram shows the STRUCTURE of a DFD!
Context diagram: University of Anytown Student System. There is one and only one Context diagram in a DFD.
Level 1 DFD: University of Anytown Student Applicant System. There is one and only one Level 1 diagram in a DFD. The Level 1 DFD identifies between two and seven or eight main processes; each such main process may be expanded into a Level 2 DFD (here, only two are shown).
Level 2 DFDs: process 1, Process applications; process 4, Prepare for and hold exam board. Each process on a Level 2 DFD may in turn be expanded into a Level 3 DFD.
Level 3 DFDs: process 1, Collect module results and produce student profile; process 2, Review module results; process 3, Decide student status.]

30.5. What appears on a DFD?

There are only FIVE kinds of thing on a DFD:

♦ Processes
A Process takes data in and processes it to create output data or information; the inputs are modified or transformed in the process of generating the outputs. Processes can often be identified in the real world with actions undertaken by individuals or whole departments. For example, a sales representative is part of a process Take Order. That process may itself be a part of (a subprocess of) the larger process, Fulfil Order.

♦ External entities
An External Entity is a source of data to the system as a whole, or a client or consumer of information (processed data) produced by the system. External Entities are outside the system being modelled. They are terminators which indicate where data comes from and where output information goes to. In designing a system, we have no idea about what these terminators do or how they do it.

♦ Data stores
A Data Store is a place where data is stored: typically, a folder in a filing cabinet (a paper file) or a file in a computer. Data Stores represent a place in the process where data comes to rest. A DFD does not say anything about the relative timing of the processes, so an example data store might be a place to accumulate data over a year for the annual accounting process.

♦ Data flows
   ∗ Data flows in from an external entity, another process, or a data store
   ∗ Data flows out to an external entity (in this case it is information), or to another process or a data store
Data flows may sometimes be obvious in the real world as documents, such as order forms (a data flow into a Fulfil Orders process) or invoices (a data flow out from a Bill Customers process).

♦ Elementary process descriptions
A process which is relatively simple and straightforward is NOT broken down into sub-processes; instead, it is described in a paragraph or so of text called an Elementary Process Description.

30.5.1. Listing the elements of a DFD

Analysts normally identify and list the main elements of a DFD on paper or in a spreadsheet before they go on to make the actual diagram. This list of elements is sometimes formally maintained as a Data Dictionary.
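A list of elements kept in this way can also be checked mechanically before any drawing is done. The sketch below is one possible way of doing so in Python, not a standard tool. It assumes a common DFD convention, implied by the description of data flows above, that every data flow has a process at one end at least. The process, external entities and first four flows come from the Anytown example; the store label "D1" and the flow "Applicant details" are hypothetical, added only for illustration.

```python
from dataclasses import dataclass

# Elements from the Anytown example; the store label "D1" is an assumption.
PROCESSES = {"1": "Process applications"}
EXTERNAL_ENTITIES = {"Applicant", "Course Coordinator"}
DATA_STORES = {"D1": "Applicants"}


@dataclass(frozen=True)
class DataFlow:
    source: str
    target: str
    name: str


FLOWS = [
    DataFlow("Applicant", "1", "Application"),
    DataFlow("1", "Applicant", "Acceptance or rejection"),
    DataFlow("1", "Course Coordinator", "Application"),
    DataFlow("Course Coordinator", "1", "Decision"),
    DataFlow("1", "D1", "Applicant details"),  # hypothetical flow, for illustration
]


def check_flows(flows):
    """Return a list of problems found in the list of data flows."""
    known = set(PROCESSES) | EXTERNAL_ENTITIES | set(DATA_STORES)
    problems = []
    for flow in flows:
        # Every end of a flow must be a listed element.
        for end in (flow.source, flow.target):
            if end not in known:
                problems.append(f"{flow.name}: unknown element {end!r}")
        # Assumed convention: at least one end of a flow is a process.
        if flow.source not in PROCESSES and flow.target not in PROCESSES:
            problems.append(f"{flow.name}: no process at either end")
    return problems
```

For the flows listed, check_flows(FLOWS) reports nothing. A flow drawn straight from the Applicant to the Applicants store, bypassing Process applications, would be reported as having no process at either end.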

30.6. The Data Flow Diagram Symbols – SSADM Notation

[Figure: the SSADM symbols used on a DFD.
A Process box, labelled "1 Process Description". The Number and Description are the same as in the Elements List (data dictionary).
A Data Store, labelled "D1 Name of Data Store". The Number and Description are the same as in the Elements List.
A Source or Destination, labelled "Source/Destination".
Arrows show DATA FLOWS.]

30.7. Making a Data Flow Diagram: a Top-Down Approach

♦ The systems analyst makes a context level DFD, which shows the interaction (data flows) between the system (represented by one process) and the system environment (represented by external entities).
♦ The system is decomposed at a lower level (Level 1) into a set of processes, data stores, and the data flows between these processes and data stores.
♦ Each process on the Level 1 diagram may then be decomposed into an even lower level diagram (Level 2) containing its subprocesses.
♦ This approach then continues on the subsequent subprocesses, until a necessary and sufficient level of detail is reached. A process at this level is called a primitive process.
♦ A primitive process is briefly described in natural language (English, French etc.). This description is sometimes called an Elementary Process Description.

30.8. The elements of a DFD

Every page in a DFD should contain fewer than 10 components. If a page has more than 10 (exceptionally 20) components, then several components (typically processes) should be combined into one, and another, lower-level DFD should be generated that describes the combined component in more detail.

Each component should be numbered, as should each subcomponent, and so on. So for example, a top level DFD would have components 1, 2, 3, 4 and 5; the sub-component DFD of component 3 would have components 3.1, 3.2, 3.3 and 3.4; and the sub-sub-component DFD of component 3.2 would have components 3.2.1, 3.2.2 and 3.2.3.

♦ Context diagram (one only)
♦ Level 1 DFD (one only, identifying up to seven / eight main processes)
♦ Level 2 DFDs (two to seven / eight in number)
♦ All key documents should be identified as data flows
♦ Key processes or functions appear in the Level 1 DFD
♦ Where necessary, explode the Level 1 DFD into component Level 2 DFDs
   ∗ Parent and child processes
   ∗ Consider parent as a window onto its children
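The numbering scheme described in this section (3, then 3.1, 3.2, then 3.2.1 and so on) follows directly from the parent and child structure, so it can be generated rather than typed by hand. The sketch below shows one way to do this in Python. The function name and the nested-list representation are my own assumptions; the process names are the subprocesses of process 4 in the Anytown example.

```python
def number_processes(subprocesses, parent=""):
    """Number a nested list of (name, children) pairs as 1, 1.1, 1.1.1 ..."""
    numbered = {}
    for position, (name, children) in enumerate(subprocesses, start=1):
        # A child's number is its parent's number plus its own position.
        number = f"{parent}.{position}" if parent else str(position)
        numbered[number] = name
        numbered.update(number_processes(children, number))
    return numbered


# The Level 2 subprocesses of process 4, "Prepare for and hold exam board".
EXAM_BOARD = [
    ("Collect module results and produce student profile", []),
    ("Review module results", []),
    ("Decide student status", []),
    ("Print results letters", []),
]
```

Calling number_processes(EXAM_BOARD, "4") numbers the four subprocesses 4.1 to 4.4, matching the process list for Anytown. Any children given for a subprocess would be numbered one level deeper in the same way.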

30.9. Creating DFDs

♦ Pencil and paper: often good enough, especially for a first draft

♦ Can use MS Draw in Word, PowerPoint

∗ Very painful!

♦ Drawing packages

∗ MS Visio Professional

(a) Comprehensive but expensive

(b) In Visio, the DFD is called a data flow model diagram in the "software" category, or diagramme de modèle de flux de données in the “logiciels” category

(c) SSADM support is no longer provided in the standard Visio product


MS Visio is now available to ESC students via the Microsoft Developers’ Network Academic Alliance MSDNAA Electronic Licence Management System ELMS. You should by now have received an email from Microsoft’s agent e-Academy telling you how you can profit from this scheme. In order to create a drawing of a particular kind, you use both a template file and a stencil file. These together tell Visio what kind of symbols can be used. The equivalent terms in French are un modèle and un gabarit. For more information on Microsoft Visio and its use, please see appendix 4.



∗ SmartDraw (www.smartdraw.com)

(a) Slightly less expensive

(b) There’s also a free time-limited trial edition

This product has built in support for SSADM, and is significantly cheaper than Microsoft Visio. However, it is much less widely used, and there is much less third party support for this product. So although it is an excellent product and quite appropriate for the work you will do on this module, it is unlikely that you will be using it in your professional life.



∗ EDraw (http://www.edrawsoft.com/)

This new product also has good support for SSADM, and is significantly cheaper than Microsoft Visio. However, it is much less widely used, and there is much less third party support for this product. So although it is an excellent product and quite appropriate for the work you will do on this module, it is unlikely that you will be using it in your professional life.



∗ Dia

So far, the open source software available does not seem to me to be good enough to compete seriously with Visio; sadly, I cannot yet identify any open source package for drawing and computer aided software engineering that is of sufficient quality to recommend. You might nevertheless consider Dia, an open source diagramming package which can be used to create some analysis diagrams.



♦ CASE tools: Computer Aided Software Engineering

Professional systems analysts often work in a project environment in which they and other developers (programmers, etc.) use comprehensive packages called CASE tools to document an entire system development and implementation approach. The market-leading CASE tool is IBM’s Rational Rose product. It is far too complex to be used by business professionals alone and unaided, although business specialists are a very important part of the overall project team (they, systems analysts, designers, programmers, etc.) which carries out large information systems projects.

30.10. First List the Elements of the Data Flow Diagram

♦ The Sources or Destinations of Data

∗ Where the data comes from or goes to (sometimes called External Entities)

♦ The Processes

∗ The processes that use or change the data

♦ Documents: flows of data

The documents used in or created by a process – for example, paper reports output by a system, or paper forms used to “capture” (record) data for input into a system – are often a very good starting point for initial systems analysis. Sometimes, in complex systems, flows of data between processes are also distinct documents.

♦ The Data Stores

∗ Repositories of data (can be card indexes, folders, computer files, documents)

30.11. Drawing the Context Diagram

A context diagram is a top-level (sometimes also known as Level 0) data flow diagram. It shows the overall context of the system and its operating environment, and it represents the whole system as a single process box that generalizes the function of the entire system in relation to the external entities. It does not show data stores.

♦ Draw a box to represent the system under study and give it a name

♦ Add the External Entities outside the system box

♦ For each data flow, draw a directional arrow between the external entity and the system box

♦ Label the data flows; typically these are documents, but they may also be other kinds of message

30.12. Expanding a context diagram to give a level 1 DFD

♦ Draw a big box to contain the expanded diagram

♦ Name it across the top

♦ Add the data flows entering or leaving the box

♦ Work back from a flow leaving the box to identify the subprocess (child process) which creates that output flow and add it to the diagram

♦ Identify the data flows which are inputs to that process

♦ Those data flows may be the direct outputs of other child processes; more often, another child process outputs to a data store, which is then intermediate between the two child processes

♦ Number the child processes; if there are more than seven or eight, it will be necessary to group some processes together at this level, and create a Level Two diagram for the grouped process

30.13. Questions to ask yourself

♦ For each process in the top level DFD:

∗ Does it need to read from or write to a data store?

∗ Does it send data to or read it from another process?

∗ Does the process have access to all the data it needs?

∗ Does it need to be specified in more detail, using a Level 2 diagram, or is it simple enough to describe in (possibly structured) English as an Elementary Process Description?

30.14. Rules for DFDs

♦ A process name has the form imperative verb followed by object (noun phrase), e.g. Enrol students

∗ Not “Enrolment” – which is a noun

♦ All processes and data stores must somewhere have data going in to them and away from them

♦ All data flows must start from or end with a process; otherwise, what makes them happen?

30.15. Some points on logical DFDs

♦ Arrows show DATA flows - not the sequence of processes, nor physical flows

♦ Sources and Destinations NEVER connect directly to a Data Store – always to Processes

♦ A Process must have at least one Data Input and one Data Output

♦ Arrows not to or from data stores should be labelled with the data that is flowing
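Some of the rules in sections 30.14 and 30.15 are mechanical enough to check automatically. The following Python sketch shows one possible way of doing so. The names Flow and check_dfd, and the small example diagram, are assumptions made for illustration; only the process name “Enrol students” comes from the text above.

```python
from collections import namedtuple

# One arrow on the diagram: where it starts, where it ends, its label.
Flow = namedtuple("Flow", "source target label")


def check_dfd(kinds, flows):
    """Return a list of rule violations (empty if none were found).

    `kinds` maps each element name to 'process', 'store' or 'external'.
    """
    problems = []
    # Rule: every data flow must start from or end with a process.
    # This also catches an external entity wired directly to a data store.
    for f in flows:
        if "process" not in (kinds[f.source], kinds[f.target]):
            problems.append(f"flow '{f.label}' does not touch a process")
    # Rule: every process and data store needs data going in and coming out.
    for name, kind in kinds.items():
        if kind == "external":
            continue
        has_input = any(f.target == name for f in flows)
        has_output = any(f.source == name for f in flows)
        if not (has_input and has_output):
            problems.append(f"{kind} '{name}' lacks an input or an output")
    return problems


# Hypothetical example diagram.
kinds = {
    "Student": "external",
    "Enrol students": "process",
    "Student file": "store",
    "Produce class lists": "process",
    "Tutor": "external",
}
good_flows = [
    Flow("Student", "Enrol students", "enrolment form"),
    Flow("Enrol students", "Student file", "student record"),
    Flow("Student file", "Produce class lists", "student records"),
    Flow("Produce class lists", "Tutor", "class list"),
]
# A faulty variant: the data store is wired straight to an external entity.
bad_flows = good_flows[:-1] + [Flow("Student file", "Tutor", "class list")]

print(check_dfd(kinds, good_flows))
print(check_dfd(kinds, bad_flows))
```

A check of this kind does not replace judgement about whether the diagram describes the real system, but it can catch slips before a diagram is reviewed.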

30.16. Supporting documentation

♦ Elementary Process Descriptions

∗ Description in natural language of a process which is not further exploded in lower-level diagrams

∗ Shouldn’t need to be long – if it is, this may indicate the need for another level of diagram

♦ The list of elements (External entity list, etc.), which is typically stored in a Data Dictionary

30.17. Summary: “levelled” DFDs

The output of the whole process of analysing processes (!) is sometimes referred to as a Levelled Data Flow Diagram. A Levelled DFD consists of:

♦ One Context diagram

♦ One Level one DFD

♦ Multiple level two DFDs (two to seven or eight in number)

♦ (Rarely) level three DFDs (two to seven or eight in number for each level two process which needs explosion)

♦ Elementary Process Descriptions where necessary


31. Appendix 3: When to use a spreadsheet, and when to use a database

31.1. Introduction

We normally store data on computers when we have many occurrences of a specific kind of record, and we want to process specific records, or complete sets of records. For example, we may want to maintain a list of companies. For purposes of comparison, we will normally choose to store the same items of data about each occurrence. For example, we will store the name of each company, its principal sector of activity, and the address of its global headquarters.

A widely accepted way of storing data (indeed, we may even refer to it as the “natural” way to store such data) is by means of two-dimensional tables. Many widely used office productivity programs provide good facilities for storing two-dimensional tables. We can use a word processing program, such as Microsoft Word; a spreadsheet, such as Microsoft Excel; or a database, such as Microsoft Access. However, each program has specific strengths and weaknesses when it stores data in this way. Refer back to section 2.6 for more on this.

31.2. Spreadsheets versus databases

31.2.1. What spreadsheets are good at

∗ Spreadsheets combine conceptual simplicity, very powerful data manipulation and analysis facilities, and good information presentation facilities

∗ Spreadsheets are easier to design and to use than are databases

∗ It is comparatively easy to evolve the use of a spreadsheet as the context of its use changes

∗ Functions make it easy to use previously programmed data analytical techniques

∗ It is possible to program new functions, or to have them written for you, so that you can use a specific data analytical technique

∗ Recent versions of Excel have excellent presentation facilities and they also connect very well to Word or PowerPoint

31.2.2. What databases are better at

∗ Spreadsheets are by their very nature highly insecure – anyone who can access a spreadsheet can see all the data in that spreadsheet; industrial strength databases make it impossible for users who are not privileged to see data to do so

∗ Spreadsheets can rapidly become very complex, and it is very difficult to understand what the overall structure of the spreadsheet is; as a result, they can become a nightmare to maintain

∗ It is difficult for more than a very small number of people to use a single spreadsheet at one time, and almost impossible to stop them from interacting with each other, often in a conflicting way

∗ Spreadsheets can handle at the most a few thousand records; databases can handle millions

∗ Databases can support tens, or even thousands, of simultaneous users

31.2.3. Using spreadsheets and databases together

Microsoft Office is an integrated suite of programs. As a result, there are many different ways in which Microsoft Excel and Microsoft Access can be made to work together. In the most difficult cases, it will be necessary to program this interchange using the Microsoft Office Automation feature. However, this is often not necessary. As stated in http://www.dummies.com/WileyCDA/DummiesArticle/id-2128.html (checked 20/11/2008):

“You can do plenty of importing and exporting data between Microsoft Office applications without writing any code at all. For example, you can perform the following actions:

• Import and export data by using options on the Access File menu.

• E-mail Access objects, such as reports, by choosing Send To --> Mail Recipient.

• Use the OfficeLinks feature to send objects to other programs.

• Use basic Windows cut-and-paste techniques and OLE (Object Linking and Embedding) to copy and link data between programs.

• Merge data from Access tables to Microsoft Word letters, labels, envelopes, or other reports, using the Word Mail Merge feature. (Search the Word Help system for merge.)

If you're just looking to get data from Access to another program (or vice versa), writing code is probably not the easiest approach. Any of the previous approaches are easier than writing custom VBA code to do the job.”

If you wish to incorporate data stored in Microsoft Access and manipulate it in a Microsoft Excel spreadsheet, for example because you wish to create a monthly report with charts and graphics, you can create a query which extracts the data from several tables and then sends the results to Microsoft Excel in the form of an external data range. In the opposite direction, it is possible to create a list of data in Excel and make it available in Microsoft Access as though it were an external table. Or you can use a Microsoft Access form to enter data into Excel. For further information, see the online help facility in each of the two products.
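The idea of a query that combines several tables and hands the result to a spreadsheet can be sketched outside Microsoft Office as well. The Python example below is only an analogy, not the Access external data range feature itself: it uses the SQLite database engine that ships with Python, and the customer and sale tables, their columns and their contents are invented for illustration. The result is written as CSV text, a format that Excel can open.

```python
import csv
import io
import sqlite3

# In-memory database with two hypothetical, related tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER,
        month       TEXT,
        amount      REAL
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Bolt');
    INSERT INTO sale VALUES
        (1, 1, '2008-10', 100.0),
        (2, 1, '2008-11', 50.0),
        (3, 2, '2008-11', 25.0);
""")

# The query joins the two tables and totals the sales per month and customer.
cursor = conn.execute("""
    SELECT s.month, c.name, SUM(s.amount) AS total
    FROM sale AS s
    JOIN customer AS c ON c.customer_id = s.customer_id
    GROUP BY s.month, c.name
    ORDER BY s.month, c.name
""")

# Write a header row plus the query result as CSV text.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([column[0] for column in cursor.description])
writer.writerows(cursor)
csv_text = buffer.getvalue()

print(csv_text)
```

The division of labour is the point: the database does the joining and totalling, and the spreadsheet is left to do the charts and presentation.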

31.2.4. Summary

When you are manipulating data for yourself alone, or as part of a small team, or in a very small business, spreadsheets are likely to be more intuitive, initially more productive and easier to get started with. However, as the volumes of data, or the number of users, increase, databases become by far the preferable option. It is often a sensible and viable option to prototype the requirements for a small business information system using a spreadsheet, and then, when the requirements are quite clear, to transfer the data storage element to a database. See also http://www.epinions.com/content_972857476 (checked 20/11/2008)


31.3. What to do if your spreadsheet skills are weak

Almost all business professionals need to be more or less able to use a spreadsheet. If you are not confident that you can use formulae in spreadsheets to carry out moderately complex data analysis, I suggest that you follow tutorials that you can find using Google. Here are a few suggestions – though they are unlikely to be the best ones:

Spreadsheet – Microsoft’s own tutorials: Link to Excel tutorials (http://office.microsoft.com/en-gb/training/CR100654561033.aspx)

Spreadsheet – Excel 2003 English: http://www.yevol.com/excel2003/index.htm

Spreadsheet – Excel 2003 français: http://translate.google.com/translate?u=http%3A%2F%2Fwww.yevol.com%2Fexcel%2Findex.htm&sl=en&tl=fr&hl=en&ie=UTF-8

Spreadsheet – Excel 2007 English: http://www.yevol.com/excel/index.htm

Spreadsheet – Excel 2007 français: http://translate.google.com/translate?u=http%3A%2F%2Fwww.yevol.com%2Fexcel%2Findex.htm&sl=en&tl=fr&hl=en&ie=UTF-8

31.4. What to do if your database skills are weak

Work through some or all of an online tutorial guide – there are some excellent Access tutorials available on the World Wide Web. See what you can find for yourself using Google; here are some suggestions:

Database – Microsoft’s own tutorials: Link to Access tutorials (http://office.microsoft.com/en-gb/training/CR100654561033.aspx)

Database – Access 2003 English: http://www.yevol.com/en/access2003/index.htm

Database – Access 2003 français: http://www.yevol.com/access2003/index.htm

Database – Access 2007 English: http://www.yevol.com/en/access/index.htm

Database – Access 2007 français: http://translate.google.com/translate?u=http%3A%2F%2Fwww.yevol.com%2Faccess%2Findex.htm&sl=en&tl=fr&hl=en&ie=UTF-8

Alternatively and / or additionally, if you were here in your first year, revise the work that you carried out on Access in the second semester of the first year.


31.5. Conclusion

There is usually more than one way to skin a cat, as we say in English! There are certainly many ways of storing data. Each program has specific strengths and weaknesses. You are advised to consider very carefully who it is that requires what information, who collects the data, what kind of transformation is required between the input data and the output information, and then to choose the right program – or programs.

For it is no coincidence that Microsoft, and other vendors, sell office productivity suites. A suite is a collection of programs, each program having specific strengths and weaknesses. Microsoft Office is just such a suite. As such, it provides many ways to store data using one program, and to link to that data in another. A very common scenario is that the data is stored in a database, while required output information is presented using the data analytical features of a spreadsheet. A good workman knows his or her tools, and chooses the appropriate tool for the job!

31.6. Acknowledgements – bibliography for Appendix 31

Thanks to St James Software (http://www.sjsoft.com/) for the original material on which I based this appendix.

Thanks to Epinions (http://www.epinions.com/content_972857476, checked 20/11/2008); see http://www.epinions.com/about/ for information about this organisation.


1. Appendix 4: Reasons why a database is to be preferred to a spreadsheet - Spreadsheet Does Not Equal Database

This section was taken from http://www.pcmag.com/article2/0,2817,1435148,00.asp (checked 24/11/2008), an article which originally appeared in the United States edition of PC Magazine, dated February 3, 2004. This material is Copyright (c) 2004 Ziff Davis Media Inc. All Rights Reserved.





February 3, 2004



By Helen Bradley

Since the days of Lotus 1-2-3, people have used spreadsheet programs for everything from word processing to data management. Doing the former is silly. Doing the latter, however, is viable, especially in the latest version of Microsoft Excel. But though you may be more comfortable with Excel, a real relational database program like Microsoft Access is a better choice for managing data—for a number of reasons.



Databases are safer. Excel, for example, does everything in memory, so that any unsaved data may be lost if your system crashes. Databases write data to the hard drive immediately.



Databases can handle more data. Sure, Excel can technically handle more than 65,000 rows of data, but doing so will likely bog down even the fastest PC.



Databases can easily link tables of related data together, such as customers and orders or musical groups and albums (as well as the songs on each album). This is where the words relational and database come together. Storing related data together in a single table or spreadsheet can be unwieldy and invite errors.

We'll look at a situation for which Access is a better tool than Excel and show you how an Access solution works. If you've never used Access before, that's okay; we'll walk you through how to create everything from scratch. We used Access 2002 for the instructions, but you'll find the process is similar in all versions of Access. We chose Access because so many users have it already, but you can do the same things in other relational databases such as FileMaker or Microsoft SQL Server. For more on picking the right database, see "Databases for All Reasons" in our issue of January 2003 at http://www.pcmag.com/article2/0,1759,760886,00.asp (checked 24/11/2008).

1.1. More Than a List

Consider a veterinarian's office: To record pet and owner details, you could use a list in an Excel worksheet, but you'd encounter difficulties. If you create one record for each owner, how would you handle an owner with multiple pets? You could add a field for each pet, which would work for most clients. But a client who runs a breeding kennel with 25 cats and innumerable kittens would force your data record to grow to an excessive length.

On the other hand, if you organize your data so you have one record per pet, you would have to enter the owner details for each pet in the household. This is unnecessarily repetitive. And if an owner changes his address, you would have to find and update all his pets' records individually.

The better solution is to have two lists, one with the owner details and one with the pet details, and then link the two by including a field in both lists with a common piece of information. For example, give each owner a unique code number, which you can then use in his pets' records. That way, you can find a pet, check the owner code, and then find the owner's details in the owner file. Likewise, you can look up an owner, find his code, and then extract all the pet records with that owner number.

Although you have two lists, each owner and each pet has only one entry in the system. It's neat and efficient, and it solves another problem our veterinarian may encounter: When client breeders sell kittens to new owners, the new owners may become clients, too. To change a pet from one owner to another, simply change the owner code in the pet's record and if necessary add a new owner record.
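The two-list idea can be tried out before any database is built. The Python sketch below holds the two lists in memory and links them through the owner code. The function names owner_of, pets_of and transfer, and the field names, are assumptions made for this illustration.

```python
# Owner list: one entry per owner, keyed by the unique owner code.
owners = {
    1: {"surname": "Brown", "town": "Athens"},
    2: {"surname": "Smith", "town": "Athens"},
}

# Pet list: one entry per pet; owner_code links each pet to its owner.
pets = [
    {"name": "Sam", "type": "Dog", "owner_code": 1},
    {"name": "Peaches", "type": "Cat", "owner_code": 2},
    {"name": "Ginger", "type": "Cat", "owner_code": 2},
]


def owner_of(pet_name):
    """Find a pet, read its owner code, then look up the owner's details."""
    pet = next(p for p in pets if p["name"] == pet_name)
    return owners[pet["owner_code"]]


def pets_of(owner_code):
    """Extract the names of all pets that carry the given owner code."""
    return [p["name"] for p in pets if p["owner_code"] == owner_code]


def transfer(pet_name, new_owner_code):
    """Move a pet to another owner by changing the code in its record."""
    pet = next(p for p in pets if p["name"] == pet_name)
    pet["owner_code"] = new_owner_code


# A change of address is made once, in the owner list only.
owners[2]["town"] = "Atlanta"

smith_pets = pets_of(2)
transfer("Ginger", 1)

print(smith_pets)
print(pets_of(1))
print(owner_of("Ginger")["surname"])
```

Each owner and each pet appears exactly once, so neither the address change nor the transfer requires more than one record to be edited.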

1.2. Create the Database

To create the database requires two tables, one for owners and one for pets, with a field common to both: the owner code. We will set up the relationship between the two tables and add a form to make it easier to enter data. Each table needs a structure that includes a list of field names and types, as well as the sizes of the fields. Each table must also have a primary key, a field that contains a piece of information unique to that record. In the owner's table, the primary key is the owner code; in the pet table, we'll use a similar field called the pet code. We will use an AutoNumber field type for each. Access will then assign a unique sequential value to that field for each record.

To build your database, launch Access, choose Blank Database from the task pane, and name your file PetHosp.mdb. Click on Create then double-click on the Create table in Design view option. The Table1:Table dialog will appear. Type Owner Nr as the first field name, then tab over to the next column and enter AutoNumber as the Data Type. (Access automatically completes the entry once you've typed the first letter.) Now enter the rest of the data as shown on the next page. Here are the fields and types:

Field Name       Data Type
Owner Nr         AutoNumber
Surname          Text
First Name       Text
Title            Text
Address 1        Text
Address 2        Text
Town             Text
Postcode         Text
Date Created     Date/Time

If you want, you can add a description for each field to explain its contents as well as a caption. The caption is a name that is used in place of the field name in reports and forms. If you use shortened or cryptic field names, captions are a good idea. To set a primary key, right-click on the area to the left of the Owner Nr field and choose Primary Key. A key icon will appear, indicating that the field is the primary key. Save the file with the name Owner, and click on the table's Close button. Repeat this process to create a second table for pets with these fields:

Field Name       Data Type
patient no       AutoNumber
patient name     Text
owner code       Number
animal type      Text
condition        Text
treatment date   Date/Time
leave date       Date/Time
date of birth    Date/Time

Set patient no as the primary key, name the table patients, and close it. Once you have created the tables, you can define the relationship between them. When you do this, Access helps you maintain your data integrity. For example, you can set up the relationship so that removing an owner automatically removes any of his pets from the patients table.

Choose Tools | Relationships. When the Show Table dialog appears, click on the Owner table and then select Add. Do the same with the patients table and then click on Close. Small dialogs will appear, showing the structure of the two tables. Drag the Owner Nr field from the Owner table and drop it on the owner code field in the patients table. When you let go of the mouse button, the Edit Relationships dialog appears with these two fields listed. Select the Enforce Referential Integrity check box and the Cascade Delete Related Records check box. This ensures that if an owner is removed, all his pets are removed, too. Click on Create to set up the relationship, which is one-to-many: one owner can have many pets (Figure 1). Click on the window's Close button and answer Yes when prompted to save the changes.

Figure 1 Owner and Patients have a one-to-many relationship.
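What the Enforce Referential Integrity and Cascade Delete Related Records options do can be observed in any relational database. The Python sketch below is an analogy using the SQLite engine bundled with Python, not Access itself: the table and column names are simplified versions of those above, AUTOINCREMENT stands in for the AutoNumber type, and the sample rows are for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to (an SQLite detail,
# playing the role of the Enforce Referential Integrity check box).
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE owner (
        owner_nr INTEGER PRIMARY KEY AUTOINCREMENT,
        surname  TEXT NOT NULL
    );
    CREATE TABLE patients (
        patient_no   INTEGER PRIMARY KEY AUTOINCREMENT,
        patient_name TEXT NOT NULL,
        -- one owner, many patients; deleting an owner deletes the pets
        owner_code   INTEGER NOT NULL
            REFERENCES owner (owner_nr) ON DELETE CASCADE
    );
""")
conn.executemany("INSERT INTO owner (surname) VALUES (?)",
                 [("Brown",), ("Smith",), ("Green",)])
conn.executemany(
    "INSERT INTO patients (patient_name, owner_code) VALUES (?, ?)",
    [("Peaches", 2), ("Sam", 1), ("Dobbin", 3), ("Ginger", 3)],
)


def try_insert_orphan():
    """Try to add a pet whose owner code matches no owner record."""
    try:
        conn.execute(
            "INSERT INTO patients (patient_name, owner_code) "
            "VALUES ('Stray', 99)"
        )
    except sqlite3.IntegrityError:
        return False  # rejected: referential integrity is enforced
    return True


orphan_accepted = try_insert_orphan()

# Removing owner 3 also removes that owner's pets (cascade delete).
conn.execute("DELETE FROM owner WHERE owner_nr = 3")
remaining = [name for (name,) in conn.execute(
    "SELECT patient_name FROM patients ORDER BY patient_no")]

print(orphan_accepted)
print(remaining)
```

The two behaviours are independent: referential integrity stops records that point at a non-existent owner from being created, while cascade delete decides what happens to existing pets when their owner is removed.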

Now you can enter data into the tables. Click on Tables in the Objects bar and double-click on Owner to open it in datasheet view. Type the following data into the table (the number in the Owner Nr field will be entered automatically):

Owner Nr  Surname  First Name  Title     Address 1      Address 2  Town     Postcode  Date Created
1         Brown    Joe         12/12/98  1 Blah St      Trackside  Athens   12345     01/04/04
2         Smith    Anne        2/2/2000  2 Blah Avenue  Uptown     Athens   12345     12/04/04
3         Green    Rick        5/5/2000  3 Blah Blvd    Downtown   Atlanta  56789     15/04/04

Close the table and then repeat the process to add the following data to the patients table (the patient no will be added automatically):

patient no  owner code  animal type  patient name  condition  treatment date  leave date  date of birth
1           2           Cat          Peaches       fever      30/04/2004      01/05/2004  01/04/2003
2           1           Dog          Sam                                                  01/04/2003
3           3           Horse        Dobbin                                               03/03/1999
4           3           Cat          Ginger                                               01/04/2003

1.3. Create a Data Entry Form

Although you could continue to add data using the two tables separately, it's easier to use a form that displays all the related data. Access can do this for you. Close both tables and click on the Forms icon in the Objects bar and double-click on Create form by using wizard. From the Tables / Queries drop-down list choose Table:Owner and click on the double angle brackets (>>) to move all the Available Fields to the Selected Fields pane. Then choose Table:Patients and move only the animal type and patient name fields from the Available Fields to the Selected Fields pane. Then click on Next.

Access will ask you, How do you want to view your data? Choose by owner and click on the Form with subform(s) option and then choose Next. When prompted, select Datasheet as the layout type for the subform and choose Next. Pick a style for your form (any will do) and click on Next. Type a form name, such as Owner and patient details, click on Open to view or enter information in the form and click on Finish to end.

A form appears on the screen with the owner data on top and the details of the pets belonging to the owner in a table below (Figure 2). You'll see two sets of record navigation tools. The one at the bottom of the table is for the patients subform and the other is for the owner records. Click on the Next Record button for the Owner data and you will see that pets are displayed for that owner.

Figure 2

Now you can add a new owner and his pet, as well as add a new patient to one of the existing owners. To see what is happening behind the scenes, close the form and open the patients table. You'll see that the data has been entered into the fields patient no and owner code, even though neither field was included on the form. The patient no number is automatically entered, because the field type is AutoNumber, and the owner code field is automatically set to the owner’s number, since the records are related through the form's design.

Remove an owner from the Owner table by opening the table, selecting the owner, and clicking on Delete. You'll be warned that a record in another data file will be affected (the owner's pets will be removed when the owner is). This is the result of selecting the Cascade Delete Related Records check box when setting up the relationship. The same does not work in reverse, and it is possible to have an owner with no pets in the Owner table. ”


2. Appendix 5: Access Hints - Designing for Use

The whole point of using Microsoft Access is to permit the safe, effective and efficient storage of data in tables so that information can be retrieved from them. We have seen that this involves analysing what data is to be stored in what tables. Designing a set of database tables which correctly reflects the structure of the data is, as we have seen, very important. However, it is almost as important:

♦ To ensure that users of the database can get out the information they are looking for - this is done, in technical terms, using reports, forms and queries;

♦ To enable users easily to put correct data into the underlying tables – this is done using forms and subforms.

In this section, we will suggest that the best way in which to get data into a database is to use forms and subforms which are based on the relational structure of the data. Very approximately, we will use forms and subforms which correspond to master and detail tables.

2.1. Getting more help

The web is crucial to learning more about Microsoft Access and in particular for getting help with problems which you find too difficult to solve alone. There are many forums in which people help each other. There are many people who are very anxious to help by writing about how they have solved problems which probably seemed complicated to them when they first encountered them! A key to getting help is to formulate your Google query very carefully, adding just the right keywords. For example, when writing this appendix, I used the Google search:

I found material helpful to my writing in the first three sites which Google displayed! They are:



http://www.microsoft.com/communities/newsgroups/en-us/default.aspx?dg=microsoft.public.access.queries&tid=462cbebf-5bef-437b-88f6fbf70e774da0&cat=&lang=&cr=&sloc=&p=1

http://articles.techrepublic.com.com/510010878_11-5285168.html

http://www.accessprogrammers.co.uk/forums/showthread.php?t=170227

2.2. Unlocking the power of many-to-many relationships

The examples here are based on the following database structure:

In this database, the many-to-many relationship between Student and Module Operation has been resolved by introducing the link entity Module Registration. But how do we make it easy for the database user to input data into the database and to get it out again? The answer requires the use of forms and subforms that are based on one of the two “owning” tables and the link (junction, intersection) table.

Using the nomenclature introduced on the diagram, we can refer to an A side and a B side of the many-to-many relationship between Student and Module Operation. Which we take as A and which as B is a matter of choice. You base the parent form on the A side of the relationship – here, on Module Operation. You base the subform on the junction table. You then use a combo box based on the table on the other side of the junction table, the B side – here, on Student. The combo box goes on the subform. In summary, to do data entry for a many-to-many relationship, use the two main tables and their junction table to create a form, subform and combo setup.

More generally, when accommodating a many-to-many relationship via an associate (junction) table, you'll need to base forms on a query which combines the fields from the various tables involved. Make sure that this query includes all the non-key fields you may want to modify or may need from both the many and the one table, and, if necessary, from other tables as well. It is also essential that the foreign key that represents the one side from the associate table be included.
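The roles of the parent form, the subform and the combo box can be mimicked with three small queries. The Python sketch below uses the SQLite engine bundled with Python rather than Access, and every table name, column name and sample row in it is an assumption made for illustration, since the actual fields of the database structure are not listed in the text. The function combo_choices plays the part of the combo box, register that of entering a row in the subform, and registered that of displaying the subform for one module operation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (                 -- B side
        student_no INTEGER PRIMARY KEY,
        surname    TEXT NOT NULL,
        forenames  TEXT NOT NULL
    );
    CREATE TABLE module_operation (        -- A side
        module_code TEXT NOT NULL,
        year        INTEGER NOT NULL,
        PRIMARY KEY (module_code, year)
    );
    CREATE TABLE module_registration (     -- junction (link) table
        module_code TEXT NOT NULL,
        year        INTEGER NOT NULL,
        student_no  INTEGER NOT NULL REFERENCES student (student_no),
        PRIMARY KEY (module_code, year, student_no),
        FOREIGN KEY (module_code, year)
            REFERENCES module_operation (module_code, year)
    );
    INSERT INTO student VALUES (1, 'Smith', 'John'), (2, 'Durand', 'Marie');
    INSERT INTO module_operation VALUES ('DB101', 2009);
""")


def combo_choices():
    """Rows for the combo box: the key plus a readable name to pick from."""
    return conn.execute(
        "SELECT student_no, surname || ', ' || forenames "
        "FROM student ORDER BY surname, forenames"
    ).fetchall()


def register(module_code, year, student_no):
    """Add one row to the junction table (one line in the subform)."""
    conn.execute("INSERT INTO module_registration VALUES (?, ?, ?)",
                 (module_code, year, student_no))


def registered(module_code, year):
    """Names of the students registered on one module operation."""
    rows = conn.execute(
        "SELECT s.surname || ', ' || s.forenames "
        "FROM module_registration AS r "
        "JOIN student AS s ON s.student_no = r.student_no "
        "WHERE r.module_code = ? AND r.year = ? "
        "ORDER BY s.surname, s.forenames",
        (module_code, year),
    )
    return [name for (name,) in rows]


choices = combo_choices()
register("DB101", 2009, choices[0][0])  # pick the first student offered

print(choices)
print(registered("DB101", 2009))
```

Only the student number is stored in the junction table; the readable name is shown purely to help the user choose, which is why the combo box needs both.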


Consider the following example. Assume that you are the module leader responsible for the operation of a module in a given year and that you want to manage students for whom you are responsible. The steps you should follow are these:



You will create a form with subform based on a query, which in this case I have called Module Operation Registration. (I’ve included all the main fields from Module, Module Operation, Module Registration and Students because I use the same query in several forms.) Using the nomenclature introduced above, Module Operation is on the A side (as is its parent entity, Module).



Create a form – subform - subform based on the query (here for Module, Module Operation and Module Registration as combined in the query Module Operation Registration). Ensure that, as a minimum, the fields displayed include all the elements of all the primary keys of the tables.



Turn the foreign key corresponding to the primary key of the B side of the many-to-many into a combo box. Here, turn the Student no field in the subform into a combo box. That combo's rowsource needs to independently query the Students table so as to return the possible values of Student no. When you choose a value it is inserted as the bound field. Your combo box therefore needs to include the student number, but also a concatenation of first and last names of the Students, so you have some meaningful data from which to select. This will enable you to choose an existing student from the list (or even enter a new one, although this is probably undesirable). To accomplish this task, open the completed form in Design view and change the foreign key field's bound control to a combo box. (Right–click the control, choose Change To, and then select Combo Box.) Set the combo box control's Row Source property to an appropriate SQL statement (see note 11):

SELECT DISTINCT [Module Operation Registration].[Student no], [Module Operation Registration].[Student surname], [Module Operation Registration].[Student forenames]
FROM [Module Operation Registration]
ORDER BY [Student surname], [Student forenames];

In addition, set the Column Count property to 3 (so that the three fields in the SELECT statement will be displayed). Return to Form view and display the control's drop-down list.

11 You don’t need to create this SQL statement yourself. Instead, separately create a query that combines the fields that you need in the usual way, using Design mode (mode création). Test that it works, then display it in SQL mode. Copy the SQL SELECT statement that Access has generated and use it to replace the SELECT statement in the Row Source mentioned above.

2.3. Some difficulties associated with forms and subforms and how to overcome them

The examples here are based on the following database structure:

2.4. Subform not updated

A common problem is that when you scroll through records in the main form, the subform is not updated. The records on the main form and the subform are not synchronized. This is because the subform is not always automatically linked to the main form. When you create a subform or subreport by dragging a form or report from the Database window onto another form or report or by using the Form Wizard, Microsoft Access does automatically set the LinkChildFields and LinkMasterFields properties, but only under one of the following conditions:



Both the main form or report and the child object are based on tables, and a relationship between those tables has been defined with the Relationships command. Microsoft Access uses the fields that relate the two tables as the linking fields.



The main form or report is based on a table with a primary key, and the subform or subreport is based on a table or query that contains a field with the same name and the same or a compatible data type as the primary key. Microsoft Access uses the primary key from the main object's underlying table and the identically named field from the child object's underlying table or query as the linking fields.

If neither condition applies, it is necessary to set these properties explicitly. A common situation is where it is necessary to incorporate an existing form as a subform to a newly-established one. Because setting the properties changes the definition of the existing form, it is wise to take a new copy of the existing form and to work with that, rather than the original.

Recalculation occurs automatically for controls that reference other fields on the same form or fields in subforms. Recalculation does not occur automatically for subform controls that only reference fields on the main (master) form or in other subforms. This is because subforms notify the main form of any changes, but the master form does not notify the subforms of changes. Nor do subforms on the same main form notify one another of any changes.

SOLUTION: Setting the LinkChildFields and LinkMasterFields properties explicitly

You should use the LinkMasterFields and LinkChildFields subform control properties to link the main form and subform automatically. You can manually update the subform by pressing the F9 (recalculate) key.

You can use the LinkChildFields and LinkMasterFields properties together to specify how Microsoft Access links records in a form or report to records in a subform, subreport, or embedded object, such as a chart. If these properties are set, Microsoft Access automatically updates the related record in the subform when you change to a new record in a main form.

You can set the LinkChildFields and LinkMasterFields properties for the subform, subreport, or embedded object as follows:



The LinkChildFields property. Enter the name of one or more linking fields in the subform, subreport, or embedded object.



The LinkMasterFields property. Enter the name of one or more linking fields or controls in the main form or report.

You can use the Subform/Subreport Field Linker to set these properties by clicking the Build button to the right of the property box in the property sheet. You can use the name of a control (including the name of a calculated control) to set the LinkMasterFields property, but you can't use the name of a control to set the LinkChildFields property. If you want to use a calculated value as the link for a subform, subreport, or embedded object, define a calculated field in the child object's underlying query and set the LinkChildFields property to the field. When you specify more than one field or control name for these property settings, you must enter the same number of fields or controls for each property setting and separate the names with a semicolon (;).

Note The linking fields don't have to be included in the main object or in the child object. As long as they are contained in the objects' underlying tables or queries, you can use the fields to link the objects. When you use a wizard, Microsoft Access automatically includes the linking fields.

2.5. Detail subform does not show the subset of records based on the value of the current master form record

Unless you take specific action, a detail subform does not show the subset of records based on the value of the current master form record when that master form record changes.

SOLUTION

Solving this problem requires both SQL and some simple VBA, which deals with certain events. Event programming is a very powerful tool that you can use within your VBA code to monitor user actions, take appropriate action when a user does something, or monitor the state of the application as it changes. An Event is an action initiated either by user action or by other VBA code. An Event Procedure is a Sub procedure that you write, according to the specification of the event, which is called automatically by Access when an event of that particular type occurs.

A form frmComboTest is defined:

This is a test form which shows how the ProductID combo box displays values based on the CategoryID selected. The content, the RowSource for the ProductID, is obtained using the SQL statement:

    SELECT distinct Products.ProductID, Products.ProductName
    FROM Products
    WHERE (((Products.CategoryID)=[forms]![frmComboTest]![CategoryID]))
    UNION select distinct null, null FROM Products
    ORDER BY Products.ProductName;

The ProductID combo box is requeried on both the OnCurrent event for the Form as well as the Change event for the CategoryID combo box. The code required for the OnCurrent event procedure is:

    Private Sub Form_Current()
        ' Refresh the product list whenever the form moves to another record
        ProductID.Requery
    End Sub

The code required for the Change event on the parent (master) is:

    Private Sub CategoryID_Change()
        ' Clear the current product, then reload the list for the new category
        ProductID.Value = Null
        ProductID.Requery
    End Sub

Pay close attention to the RowSource for the ProductID combo box. The RowSource is an SQL statement. It is based on a UNION query with the appropriate Product table records as well as a row that contains null values. When the CategoryID combo box is changed, the ProductID combo box receives the null value. This is how the contents of the ProductID combo box are cleared. Unfortunately, the SQL used as the RowSource has to be written by you, the user. This particular kind of SQL statement cannot automatically be generated on the basis of a user-defined query.

Note: The syntax for referring to objects, such as forms and controls, is not completely straightforward. Use either of the following syntax statements to reference a control on a main form:

    Forms!formname!controlname
    Me!controlname

(In more recent versions, you can substitute dot (.) for exclamation mark (!) between objects.)

One of the most common mistakes made in Access form development is improper syntax when referencing controls on a subform. As far as Access is concerned, a subform is just another control on the main form. To refer to a subform or a control on a subform, you must remember that Access treats the subform as a control. Essentially, you have a form with a control with a control. To express that arrangement in terms Access can decipher, you need the Form property as follows:

    Forms!mainform!subform.Form.controlonsubform
    Me!subform.Form.controlonsubform

In other words, subform is simply a control on the main form.

In the example given above, frmComboTest is a form. If it is included as a form on another form (for example, mainform), that is, if it is a subform of mainform, then the SQL SELECT statement needs to be amended so that the reference goes through the subform control and its Form property, as described in the note above:

    SELECT distinct Products.ProductID, Products.ProductName
    FROM Products
    WHERE (((Products.CategoryID)=[forms]![mainform]![frmComboTest].[Form]![CategoryID]))
    UNION select distinct null, null FROM Products
    ORDER BY Products.ProductName;

If the full syntax is not respected, Access returns an error when the field on the form is used.
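The effect of the UNION row source can be tried outside Access. What follows is only a minimal sketch in Python using the standard sqlite3 module, not Access itself. The three Products rows are invented for illustration, the form reference [forms]![frmComboTest]![CategoryID] is replaced by a query parameter, and the helper name product_rows is hypothetical.

```python
import sqlite3

# In-memory stand-in for the Products table; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products ("
    "ProductID INTEGER PRIMARY KEY, ProductName TEXT, CategoryID INTEGER)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [(1, "Chai", 1), (2, "Chang", 1), (3, "Aniseed Syrup", 2)],
)


def product_rows(category_id):
    """Rows a ProductID combo box would list for one category.

    The '?' parameter plays the role of the form reference
    [forms]![frmComboTest]![CategoryID] in the Access row source.
    """
    sql = """
        SELECT DISTINCT ProductID, ProductName
        FROM Products
        WHERE CategoryID = ?
        UNION
        SELECT DISTINCT NULL, NULL FROM Products
        ORDER BY ProductName
    """
    return conn.execute(sql, (category_id,)).fetchall()


# Each list starts with the all-null row that lets the combo box be cleared.
for category in (1, 2):
    print(category, product_rows(category))
```

Calling product_rows again after the category changes corresponds to what ProductID.Requery does in the two event procedures.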


3. Appendix 6: Normalisation

3.1. Introduction to Normalisation

With thanks to my former colleague at Huddersfield, Steve Wade, on whose material much of this section is based! I have also referred to the book by Graham Curtis (Curtis & Cobham 2008).

Normalisation is a bottom-up technique for relational data analysis based on analysing inter-relationships between data items. From our point of view, this is just an alternative way to establish the entity types, their attributes, and their relationships. We will use it primarily as a way of crosschecking that we have found all the relevant entities, attributes and relationships.

Curtis (2008) says: “Normalisation results in a fine tuning of the entity model. It may lead to more entities and relationships being defined if the entity model does not contain entities in the simplest form. The analyst is moving away from considering a high-level logical model of the organisation to the detailed analysis of data and its impact on that model. Doing this ensures that data is organised in such a way that (1) updating a piece of data generally requires its update in only one place, and (2) deletion of a specified piece of data does not lead to the unintended loss of other data.”

The end product of the technique is a set of entities designed to minimise redundancy of data and to avoid consistency problems.

3.2. Introduction

♦ The relational database has a mathematical basis in Set Theory.
♦ It is possible to exploit the mathematical basis for relational database design to improve the quality of the actual design. Normalisation is a formal technique for ensuring that the right attributes appear on the right entities.
♦ Also called relational data analysis, the technique of normalisation is based on a property of data called dependency or functional dependency.
♦ Normalisation aims to yield a set of entities designed to:
    ♦ Minimise data redundancy
    ♦ Avoid consistency problems
♦ Normalisation is a “bottom up” technique. Instead of starting with a top-down analysis of user requirements, this technique starts with the existing situation: the technique examines business documents as they are currently used in existing business processes. From this, it induces the necessary database entities. For example, the starting point might be an existing purchase order form. As we saw above in section 2, normalisation enables us to deduce the need for several entity types, including purchase order, supplier, product and order detail line.
♦ It is applied to attributes discovered on paper and computer forms, viewed as a table (cf. Spreadsheet view).

3.3. Preliminary remarks

♦ An Entity name takes the form of a singular noun (or occasionally noun phrase), e.g. Student, Module, Module registration.
♦ Attributes should be singular and represent a single fact about an entity.
    ♦ They MUST NOT be lists (= more than one fact); an Awards attribute for a student is WRONG.
♦ Each attribute should depend upon the whole key (an issue only if the key is compound).
    ♦ If an attribute depends on part only of the key, this is WRONG.
♦ Each attribute should depend only on the key.
    ♦ If an attribute depends on any other non-key attribute, this is WRONG.

What do I mean by wrong? If you design a database which does not respect the rules which follow, it is likely that you will store duplicate, and therefore potentially inconsistent, data; or that other inconsistencies will develop, especially when you delete entity occurrences.

3.4. Terminology

3.4.1. Records

Data tends to be held in groups of items. Each individual item of data is a field, and the group of fields constitutes a record.

3.4.2. Field names

Field names are also called attributes.

3.4.3. Keys

♦ Introduction
  Before we can store details of (facts about) things in a database, we need unique labels, that is, identifiers or names, for the entities about which attributes are to be stored. These identifiers, or keys, need to be chosen with precision and consistency. Candidate keys are possible labels / names / identifiers. Where there is more than one candidate key, we need to choose one as the primary key. Usually we choose numeric / short keys. Often, we deliberately create a unique key (perhaps intended to be computer-generated), such as a student enrolment number.

♦ Candidate keys
  There may be more than one possible key in a given situation. For example, we might identify an Employee by her payroll number, or by her National Insurance (NI) number in the UK, or Social Security number in the US.

♦ Choose numeric / short keys
  This may imply encoding, e.g. BAIB for BA (Honours) International Business.

♦ Often need to create a unique key (perhaps computer-generated)
  Microsoft Access offers the AutoNumber facility to assist in generating unique keys.

♦ Key types
  Keys may be:
    ♦ Simple: a single attribute
    ♦ Compound:
        (a) the key is made up of more than one attribute
        (b) each attribute is often a simple key in another relation
    ♦ Secondary: identifies a group of linked occurrences. This document does not discuss secondary keys.

♦ Candidate key
  An attribute or combination of attributes is a candidate key if it uniquely identifies a record. We have to choose to make one the Primary key, and leave others as Alternate keys (i.e. candidate keys which have not been chosen as the Primary key).



♦ Foreign keys
  Foreign keys implement one to many (1:M) relationships in the following way. If two entity types are related 1:M, then the primary key attribute(s) (or, rarely, the alternate key attribute(s)) of the one entity MUST appear as attribute(s) of the many entity. This is because this is the only way in which the database software can “join” the many records to the one. Consider a situation in which students are on a programme. The entity types are Programme and Student, related 1:M. If the primary key of Programme is Programme_Code, then Student must also have a Programme_Code attribute.
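The Programme and Student example can be sketched as follows. This is an illustration in Python with the standard sqlite3 module rather than Access. Only Programme_Code comes from the text above; the other column names (Programme_Name, Student_No, Student_Name) and the sample rows are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Programme (
        Programme_Code TEXT PRIMARY KEY,   -- the "one" side
        Programme_Name TEXT                -- assumed column
    );
    CREATE TABLE Student (
        Student_No     INTEGER PRIMARY KEY,  -- assumed column
        Student_Name   TEXT,                 -- assumed column
        Programme_Code TEXT REFERENCES Programme(Programme_Code)  -- foreign key
    );
""")
conn.execute(
    "INSERT INTO Programme VALUES ('BAIB', 'BA (Honours) International Business')"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Student A", "BAIB"), (2, "Student B", "BAIB")],
)

# The shared Programme_Code is what lets the database join many students to one programme.
joined = conn.execute("""
    SELECT s.Student_No, p.Programme_Name
    FROM Student AS s
    JOIN Programme AS p ON p.Programme_Code = s.Programme_Code
    ORDER BY s.Student_No
""").fetchall()

# A student pointing at a programme that does not exist is rejected.
try:
    conn.execute("INSERT INTO Student VALUES (3, 'Student C', 'NOPE')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(joined, orphan_rejected)
```

In Access, the equivalent of the REFERENCES clause is a relationship defined in the Relationships window with referential integrity enforced.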



♦ Questions and Answers
  Question: How many primary keys must an entity have?
  Answer: one
  Question: How many foreign keys does an entity have?
  Answer: potentially several, one per 1:M relationship in which the entity is at the many end



♦ Functional Dependency
  This is a fundamental concept, initially a bit difficult to grasp. Consider an entity E that has two attributes A and B. The attribute B of the entity is functionally dependent on the attribute A if and only if no more than one value of B is associated with each value of A. In other words, the value of attribute A uniquely determines the value of B, and if several entity occurrences have the same value of A then all these entity occurrences have an identical value of attribute B.

  A and B need not be single attributes. They could be any subsets of the attributes of an entity E (possibly single attributes). We may then write

      E.A -> E.B

  This can be read as A determines B (though this is not strictly correct), or that B depends on A (which is true).

♦ Dependency made simple(r!)
  An attribute B is functionally dependent on A if, for any particular value of A, there is one value of B and there can be no other. Example: a healthy animal has a number of legs which is the same for all animals of that type. Knowing an animal is a dog, we know it ought to have four legs. Number of legs is functionally dependent upon animal type.
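The definition can be turned into a small check on sample data. The sketch below is in Python; the function name is_functionally_dependent and the animal rows are invented for illustration. Note that a check of this kind can only disprove a dependency: passing on a sample does not prove that the dependency holds for all possible data.

```python
def is_functionally_dependent(rows, a, b):
    """Return True if, in these rows, each value of attributes `a`
    is associated with no more than one value of attributes `b`."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in a)
        value = tuple(row[name] for name in b)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Invented sample data following the animal example in the text.
animals = [
    {"name": "Rex", "animal_type": "dog", "legs": 4},
    {"name": "Fido", "animal_type": "dog", "legs": 4},
    {"name": "Tweety", "animal_type": "bird", "legs": 2},
    {"name": "Felix", "animal_type": "cat", "legs": 4},
]

# animal_type -> legs holds; legs -> animal_type does not (dogs and cats both have 4).
print(is_functionally_dependent(animals, ["animal_type"], ["legs"]))
print(is_functionally_dependent(animals, ["legs"], ["animal_type"]))
```

Because `a` and `b` are lists of attribute names, the same function covers the case where A or B is a set of attributes rather than a single one.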

3.5. The various stages of normalisation

3.5.1. Convert data into unnormalised form (UNF, 0NF)

List out all the data attributes you can identify. I find it useful to record them on a spreadsheet or in a word processor running in outline mode. Using either of these tools, it is easy to reorder data as you realise that particular attributes are part of another entity from the one you first thought.

3.5.2. Convert UNF into First Normal Form (1NF)

♦ 1NF: The rule
    ♦ There must be only one value per cell (row / column intersection) in the entity viewed as a table. That is, an attribute must be a single value, and not a list.
♦ Identify repeating groups: data field(s) (one or many) that can have multiple values for the single main key.
♦ You should remove repeating groups ('remove' means set up as a separate entity).
♦ The key to the new entity will be a compound key comprising the original key plus additional information needed to uniquely identify individual occurrences.

3.5.3. Convert 1NF into Second normal form (2NF)

♦ Rule for 2NF: A 1NF entity is also in 2NF if every non-key attribute depends on the whole of the key.
♦ Avoids duplication, which is inefficient and leads to update problems.
♦ For each entity, determine whether the key is compound.
♦ For entities which have a compound key, ask: "Are there any non-key attributes which depend only on part of the key?"
♦ If there are: remove them (i.e. set up a new entity).

3.5.4. Convert 2NF into Third normal form (3NF)

♦ For each entity, ask: "Are there any non-key attributes dependent only on any other non-key attributes?"
♦ If there are: remove them (i.e. set up a new entity).
♦ Rule for 3NF: A 2NF entity is also in 3NF if no non-key attribute depends on any other non-key attribute.
♦ Avoids duplication, which is inefficient and leads to update problems.
♦ Gives us somewhere to store data when there is (in this case) no order.

3.6. Further normalisation

Fourth normal, Boyce-Codd normal and further normal forms have been identified in the database literature but circumstances in which they are needed are so rare as to be of little practical significance.

3.7. A full example of normalisation

We shall consider how each of these steps might be carried out on the following document:

Purchase Order placed by Entrepôt Direct, Lille

Date                    01/05/2004
Purchase order number   1234567
Supplier number         1
Supplier name           Aardvark
Supplier address        23bis rue du Flâneur 35000 RENNES

Supplier product code   Product Name    Quantity required   Packet size   Purchase price   Sub-total
A1                      Digital Radio   10                  1             160,00           1600,00
A2                      Whiteboard      16                  1             120,00           1920,00

Pre-tax total           3520,00
VAT rate                19,6%
VAT                     689,92
Total remitted          4209,92

3.7.1. Step 1 - Convert data into UNF

This involves representing the data in the Purchase Order in the following format:

Purchase order number
Supplier number
Supplier Name
Supplier Address
    Supplier product code
    Product Name
    Quantity required
    Packet size
    Purchase price

This UNF representation indicates that:

♦ "Purchase order number" is the key data item.
♦ "Supplier number, Name and Address" occur once per order.
♦ The remaining items are repeated a number of times as a group. They are indented above, which emphasises the repetition.

3.7.2. Step 2 - Convert data into 1NF

Remove repeating groups, i.e. groups of data fields (or a single data field) that may have multiple values for a single value of the key. Set such groups up as a separate entity. The key to this new entity will be a compound key comprising the original key plus additional information to identify individual occurrences. Applying this to the above example gives us the following:

First Normal Form

Purchase order number
Supplier number
Supplier Name
Supplier Address

Purchase order number
Supplier product code
Product Name
Quantity required
Packet size
Purchase price

We now have two entities, purchase order and purchase order detail. The rule for 1NF is therefore: There must be only one value per cell (row/column intersection) in the entity. Put another way, an entity is in first normal form (1NF) if there are no repeating groups of attributes.
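The move from UNF to 1NF can be sketched in a few lines of Python. The data is taken from the purchase order above; the dictionary layout, the lower-case attribute names and the function name to_first_normal_form are assumptions made for the illustration.

```python
# The purchase order in unnormalised form: the order lines are a repeating group.
unf_order = {
    "purchase_order_number": 1234567,
    "supplier_number": 1,
    "supplier_name": "Aardvark",
    "supplier_address": "23bis rue du Flâneur 35000 RENNES",
    "lines": [
        {"supplier_product_code": "A1", "product_name": "Digital Radio",
         "quantity_required": 10, "packet_size": 1, "purchase_price": 160.00},
        {"supplier_product_code": "A2", "product_name": "Whiteboard",
         "quantity_required": 16, "packet_size": 1, "purchase_price": 120.00},
    ],
}


def to_first_normal_form(order):
    """Split one unnormalised order into a purchase order row and
    one purchase order detail row per member of the repeating group."""
    purchase_order = {k: v for k, v in order.items() if k != "lines"}
    # Each detail row carries the original key, so its own key is the compound
    # (purchase_order_number, supplier_product_code).
    details = [
        {"purchase_order_number": order["purchase_order_number"], **line}
        for line in order["lines"]
    ]
    return purchase_order, details


purchase_order, details = to_first_normal_form(unf_order)
print(purchase_order)
for row in details:
    print(row)
```

Every value in the two results is now a single value rather than a list, which is exactly what the 1NF rule asks for.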

3.7.3. Step 3 - Convert data into 2NF

For each entity we must ask:

♦ Does the entity have a compound key?
♦ If it does, then we must ask: Are there any non-key attributes which depend on only a part of the key?

Rule for 2NF: A 1NF entity is also in 2NF if every non-key attribute depends on the whole of the key.

Any attributes that are dependent on only a part of the key should be removed and stored in their own entity along with the part-key on which they depend. Applying this rule to our example leads us to produce the following representation:

Second Normal Form

Purchase order number
Supplier number
Supplier Name
Supplier Address

Purchase order number
Supplier product code
Quantity required
Purchase price

Supplier product code
Product Name
Packet size

What was wrong with 1NF, and what have we gained by moving to 2NF? The answer is that the 1NF representation contains unnecessary repetition of "Product Name" and "Packet size" information for every product ordered. The same product may be ordered many hundreds of times so that storing the data in 1NF could represent a waste of disk space. More importantly, this amount of redundancy in the way data is stored could lead to significant update problems. Another problem with our 1NF representation is that there is nowhere in the database to store information about products which are not currently on order. So to summarise, by normalising, we have discovered a third entity type, which is going to be called something like Product, or Stock item.

To avoid the possibility of the database becoming inconsistent (with some copies of the same data being updated whilst other copies are overlooked) we would ideally like to store each piece of data only once. This is really what normalisation is all about.
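The 2NF step can be sketched in the same way. In the Python below, the first two detail rows come from the purchase order; the third row (order 1234568) is invented so that the repetition of product data is visible. The names PART_KEY_ATTRIBUTES and to_second_normal_form are hypothetical.

```python
# 1NF purchase order detail rows. The row for order 1234568 is invented.
details_1nf = [
    {"purchase_order_number": 1234567, "supplier_product_code": "A1",
     "product_name": "Digital Radio", "quantity_required": 10,
     "packet_size": 1, "purchase_price": 160.00},
    {"purchase_order_number": 1234567, "supplier_product_code": "A2",
     "product_name": "Whiteboard", "quantity_required": 16,
     "packet_size": 1, "purchase_price": 120.00},
    {"purchase_order_number": 1234568, "supplier_product_code": "A1",
     "product_name": "Digital Radio", "quantity_required": 5,
     "packet_size": 1, "purchase_price": 160.00},
]

# Attributes that depend only on supplier_product_code, i.e. on part of the key.
PART_KEY_ATTRIBUTES = ("product_name", "packet_size")


def to_second_normal_form(details):
    """Move part-key dependent attributes into their own product entity."""
    products = {}
    lines = []
    for row in details:
        code = row["supplier_product_code"]
        # Stored once per product, however many times the product is ordered.
        products[code] = {
            "supplier_product_code": code,
            **{name: row[name] for name in PART_KEY_ATTRIBUTES},
        }
        lines.append({k: v for k, v in row.items() if k not in PART_KEY_ATTRIBUTES})
    return lines, list(products.values())


lines, products = to_second_normal_form(details_1nf)
print(len(lines), "order lines,", len(products), "products")
```

Three order lines reduce to two product rows, and a product can now exist in the product list even when no order line refers to it.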

3.7.4. Step 4 - Convert data into 3NF

For each 2NF entity we must ask:

♦ Are any of the non-key attributes dependent on any other non-key attributes?

So that we can enforce the following rule:

Rule for 3NF: A 2NF entity is also in 3NF if no non-key attribute depends on any other non-key attribute.

Applying this rule to our example would give the following representation:

Third Normal Form

PURCHASE ORDER
Purchase order number
Supplier number

PURCHASE ORDER LINE
Purchase order number
Supplier product code
Quantity required
Purchase price

PRODUCT
Supplier product code
Product Name
Packet size

SUPPLIER
Supplier number
Supplier Name
Supplier Address

Note: The items in CAPITALS above are suggested names for the entities now identified.

Again we should ask the question: "What's wrong with 2NF"? Our 2NF representation included unnecessary repetition of "Supplier Name" and "Supplier Address" for every purchase order associated with the same supplier. This corresponds to the first problem that we discussed with regard to 1NF. The second problem corresponds to the fact that in the 2NF representation there is nowhere to store information about suppliers from whom nothing is currently on order. So normalisation here has identified the existence of a supplier entity.

There is another possible reason for interdependency of attributes. This is that one attribute is calculated from others. It is very wise not to make such calculated attributes part of the database structure. Instead, it is better simply to remove them, and to create them as calculated fields on queries, reports or forms as they are needed. In this example, it would be unwise to have a subtotal attribute calculated as quantity required times purchase price. Instead, as the subtotal is needed (e.g. on a report or form) it should normally be recalculated by a formula. The exception to this advice is where it genuinely is necessary to store the value calculated at one point in time, typically for accounting reasons.
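As an illustration, the four 3NF entities can be set up as tables and the sub-total produced by a query rather than stored. The sketch uses Python's sqlite3 module instead of Access; the table and column spellings are assumptions, and the data is that of the purchase order above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table and column names are assumed spellings of the 3NF entities.
    CREATE TABLE Supplier (
        supplier_number  INTEGER PRIMARY KEY,
        supplier_name    TEXT,
        supplier_address TEXT
    );
    CREATE TABLE Product (
        supplier_product_code TEXT PRIMARY KEY,
        product_name          TEXT,
        packet_size           INTEGER
    );
    CREATE TABLE PurchaseOrder (
        purchase_order_number INTEGER PRIMARY KEY,
        supplier_number       INTEGER REFERENCES Supplier(supplier_number)
    );
    CREATE TABLE PurchaseOrderLine (
        purchase_order_number INTEGER REFERENCES PurchaseOrder(purchase_order_number),
        supplier_product_code TEXT REFERENCES Product(supplier_product_code),
        quantity_required     INTEGER,
        purchase_price        REAL,
        PRIMARY KEY (purchase_order_number, supplier_product_code)
    );
    INSERT INTO Supplier VALUES (1, 'Aardvark', '23bis rue du Flâneur 35000 RENNES');
    INSERT INTO Product VALUES ('A1', 'Digital Radio', 1), ('A2', 'Whiteboard', 1);
    INSERT INTO PurchaseOrder VALUES (1234567, 1);
    INSERT INTO PurchaseOrderLine VALUES
        (1234567, 'A1', 10, 160.00),
        (1234567, 'A2', 16, 120.00);
""")

# The sub-total is a calculated field of the query, not a stored attribute.
subtotals = conn.execute("""
    SELECT supplier_product_code,
           quantity_required * purchase_price AS sub_total
    FROM PurchaseOrderLine
    WHERE purchase_order_number = 1234567
    ORDER BY supplier_product_code
""").fetchall()

pre_tax_total = sum(sub_total for _, sub_total in subtotals)
print(subtotals, pre_tax_total)
```

The results agree with the sub-totals and the pre-tax total printed on the purchase order, without any of those figures being stored in a table.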

3.8. Normalisation: A Summary

♦ First normal form: No repeating attributes or groups of attributes
♦ Second normal form: Remove dependencies on part (only) of the key
♦ Third normal form: Remove inter-dependencies

Generally, the 3NF representation is the ideal we should strive for. There are higher normal forms for dealing with anomalous situations. They are explained in the database literature, but rarely have any practical significance. The steps involved in normalisation may be summarised as follows:

1. Separate repeating groups
2. Separate partial-key dependencies
3. Separate non-key dependencies

Most important aspects of normalisation

♦ ESSENTIAL rule: No repeating attributes (no plurals) (1NF). If you break this essential rule, you will end up with a completely unusable database design.
♦ GOLDEN rule: “An entity is fully normalised if every non-key attribute depends upon the key, the whole key and nothing but the key.” Even if you do not carry out all the stages of normalisation, this can be a very useful final check to apply to the results of any database design work.

3.9. Normalisation complements top-down entity-relationship modelling

Once you have completed normalisation, you should compare the results with those achieved by the top-down modelling carried out in accordance with Chen's ER model and resolve any inconsistencies.

3.10. What is achieved by normalisation?

Normalisation of entity types leads to a data model that forms the basis of a good database design because it:

♦ Decomposes entity types into their simple “atomic constituents”, that is, their basic parts
♦ Ensures that data is not unnecessarily repeated
♦ Allows data on entities to be independent of the existence of other entities

It is also important to realise that the data that was associated with the original unnormalised entity is still recoverable. The entities are connected at the entity level by their key attributes. Having normalised each of the entity types in the model, it is possible to recombine the entity types in order to answer specific questions. This is done in Microsoft Access by means of queries which join together more than one table.

So a further advantage of normalisation is that tables which are in third normal form can be recombined so as to answer almost any conceivable question.
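To illustrate the point about recombination, the sketch below rebuilds the lines of the original purchase order from the four normalised tables with a single join query. It is written in Python with sqlite3 rather than in Access, and the table and column names are assumed spellings of the 3NF entities in section 3.7.4, cut down to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced versions of the 3NF tables; names are assumptions.
    CREATE TABLE Supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE Product (supplier_product_code TEXT PRIMARY KEY, product_name TEXT);
    CREATE TABLE PurchaseOrder (
        purchase_order_number INTEGER PRIMARY KEY,
        supplier_number INTEGER
    );
    CREATE TABLE PurchaseOrderLine (
        purchase_order_number INTEGER,
        supplier_product_code TEXT,
        quantity_required INTEGER
    );
    INSERT INTO Supplier VALUES (1, 'Aardvark');
    INSERT INTO Product VALUES ('A1', 'Digital Radio'), ('A2', 'Whiteboard');
    INSERT INTO PurchaseOrder VALUES (1234567, 1);
    INSERT INTO PurchaseOrderLine VALUES (1234567, 'A1', 10), (1234567, 'A2', 16);
""")

# The key attributes shared between tables are what make the recombination possible.
rebuilt = conn.execute("""
    SELECT o.purchase_order_number, s.supplier_name,
           p.product_name, l.quantity_required
    FROM PurchaseOrder AS o
    JOIN Supplier AS s ON s.supplier_number = o.supplier_number
    JOIN PurchaseOrderLine AS l ON l.purchase_order_number = o.purchase_order_number
    JOIN Product AS p ON p.supplier_product_code = l.supplier_product_code
    ORDER BY l.supplier_product_code
""").fetchall()

for row in rebuilt:
    print(row)
```

Each output row repeats the order number and supplier name, just as the unnormalised document did, but that repetition now exists only in the query result and not in the stored data.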

3.11. How is normalisation used in practice?

Very few database designers use normalisation as the only way in which they build data models. Instead, normalisation is used in one of the two following ways:

♦ The data model is built in two ways, using top-down entity modelling (section 24), and by bottom-up normalisation. The results are then compared and any inconsistencies resolved by specific design decisions.
♦ The data model created by top-down means is checked to ensure that all the entities are in third normal form. This can either be done by using the three rules outlined above for each stage of normalisation, or simply by applying the so-called “golden rule” of normalisation.

3.12. Still confused?

See if Microsoft’s explanation at http://support.microsoft.com/default.aspx?scid=kb;ENUS;283878 (checked 24/11/2008) is any help. There’s also an associated webcast.

3.13. Some questions with which to check your understanding

The answers to these questions are NOT in this document. However, if you do them, you may approach the author of this document to discuss them!

1. Identify the duplicated data in the following table:

Emp. No.   Emp. Name   Job Code   Job Title           Start Date    Finish Date
123        Smith       A1         Trainee Analyst     3/2/2000      5/2/2001
349        Cairns      P3         Senior Programmer   3/2/2001      3/9/2001
541        McPhee      P3         Senior Programmer   5/2/2001      7/4/2003
123        Smith       A2         Analyst             6/2/2001      7/5/2003
541        McPhee      P4         Chief Programmer    8/4/2003
123        Smith       A3         Senior Analyst      8/5/2003      10/10/2003
632        Keith       A2         Analyst             9/4/2004
123        Smith       M1         Project Manager     11/10/2003

2. Given the following table, are the following statements true or false?

a. A customer can have more than one salesperson.
b. A salesperson can have more than one customer.
c. A discount code is associated with only one discount percentage.
d. The total sales to date is determined solely by customer no.

S.Person No.   S.Person Name
002            Kellerman
014            Adams
027            Jennings

Cust. No.   Cust. Name   Cust. Disc. Code   Cust. Disc %   Total Sales to Date
257         Jones        A                  10             2272
286         Brown        B                  15             189
295         Foster       B                  15             23652
317         Green        C                  20             24272
352         Tate         A                  10             5734
463         Young        O                  0              5734
494         Peacock      C                  20             4153
295         Foster       B                  15             2253

[In the original table each salesperson row spans several of the customer rows; that grouping is not recoverable here.]

i) Choose a key field for the above table.

ii) Remove the repeating groups from the above table to produce 1NF tables.
iii) Perform 2NF analysis on the tables identified in (ii).
iv) Perform 3NF analysis on the tables identified in (iii).
v) Represent the tables in (iv) as an ERM.

3. Consider a retail company that stores sales information. The information is currently stored in a single file. The company has several stores (shops) and the file has a record for every product line on sale at each store. The file also contains details of future price changes and the effective date for which these have been scheduled. The file therefore has the following structure:

Store   Item   Description    Annual Sales to Date   Price   Effective Date
10      AB13   Towel          1100.00                1.15    30/06/03
10      CF99   Handkerchief   350.00                 0.38    30/06/03
10      CF99   Handkerchief   *                      0.47    31/10/03
10      HK76   Jeans          1700.00                26.99   30/06/03
20      AB13   Towel          840.00                 *       *

The entries marked * have values that have already been entered, i.e. * represents 'ditto'.

(I) What problems might result from storing this data in a single table?
(II) Take the data in the file through to third normal form.
(III) Does the new file structure address all the problems identified in (I)?
(IV) If the sales manager wanted to add the following data to the files:
    - Supplier Name for each item
    - Name of store manager for each store
    - Maximum quantity of each item to be stored in each store
  Where would the data be stored in your 3NF model? Would any new tables be required?

4. The following table shows the breakdown of student marks on different courses by assignment number. In this example we have a repeating group inside a repeating group. For each course there is repeating student data and for each student there is repeating assignment data.

Course Code   Course Title       Student Code   Student Name   Ass. No.   Ass. Subject                 Ass. Mark
SA            Systems Analysis   A1234          Wade           1          Dataflow Diagrams            75
                                                               2          Entity Relationship Models   67
                                                               3          Normalisation                25
                                 A1235          Walker         1          Dataflow Diagrams            60
                                                               2          Entity Relationship Models   54
                                                               3          Normalisation                32
DB            Database Design    A1230          Smith          1          SQL                          65
                                                               2          Prototype Implementation     54
                                 A1234          Wade           1          SQL                          55
                                                               2          Prototype Implementation     64

i) Take the data in this report through to 3NF. What are the benefits of storing this data in third normal form?

5. The Natural Yoghurt Company sells many products. Each product is composed of several raw ingredients that are supplied by various vendors. A particular ingredient is always supplied by the same vendor; however a vendor may supply more than one ingredient. The product line (product offering) is divided up so that only one department is responsible for a particular product. However, each department is responsible for more than one product. Each manager manages exactly one department. The following data items must be stored in the Natural Yoghurt Company’s database:

♦ Product Number
♦ Manager’s name
♦ EmployeeID of manager
♦ Ingredient Number
♦ Ingredient name
♦ Department name
♦ Dept office address
♦ Vendor name
♦ Dept. phone number
♦ Vendor address
♦ Product name
♦ Department Identification No.
♦ Qty of ingredient required for product
♦ Vendor ID

Derive an entity relationship model and a set of 3NF tables from the above description.

4. Appendix 7: Installing and using Microsoft Visio

4.1. Introduction

MS Visio is now available to ESC students via the Microsoft Developers’ Network Academic Alliance (MSDNAA) Electronic Licence Management System (ELMS). You should by now have received an email from e-academy telling you how you can profit from this scheme.

In order to create a drawing of a particular kind, you use both a template file and a stencil file. These together tell Visio what kind of symbols can be used. The equivalent terms in French are un modèle and un gabarit.

Microsoft Office Visio 2007 makes it easy for business and ICT professionals to visualise, explore, and communicate complex information. Rather than complicated text and tables that are hard to understand, you can use Visio diagrams that communicate information at a glance. Instead of static pictures, you can create data-connected Visio diagrams that display data, are easy to refresh, and dramatically increase your productivity. You can use the wide variety of diagrams in Office Visio 2007 to understand, act on, and share information about organizational systems, resources, and processes throughout an enterprise.

Office Visio 2007 is available in two stand-alone editions: Office Visio Professional and Office Visio Standard. Office Visio Standard 2007 has the same basic functionality as Visio Professional 2007 and includes a subset of its features and templates. Office Visio Professional 2007 offers advanced functionality, such as data connectivity and visualization features, that Office Visio Standard 2007 does not.

4.2. Visualize complex information to better understand it

Office Visio 2007 provides a wide range of templates (business process flowcharts, network diagrams, workflow diagrams, database models, and software diagrams) that you can use to visualize and streamline business processes, track projects and resources, chart organizations, map networks, diagram building sites, and optimize systems. You can more easily visualize processes, systems, and complex information using new or improved features in Office Visio 2007:

♦ Get started quickly with templates. Office Visio 2007 includes specific tools to support the diverse diagramming needs of IT and business professionals. Create a broader range of diagrams with new templates, such as the ITIL (Information Technology Infrastructure Library) template and the Value Stream Mapping template in Office Visio Professional 2007. Use the predefined Microsoft SmartShapes symbols and powerful search capabilities to locate the right shape, whether it is saved on a computer or on the Web.

♦ Quickly access templates you use often. In the Getting Started window, find the template you need by browsing simplified template categories and using large template previews. Locate the templates you used recently by using the new Recent Templates view in the Getting Started window.

♦ Get inspired by sample diagrams. Find new sample diagrams more easily by opening the Getting Started window and using the Samples category in Office Visio Professional 2007. View sample diagrams that are integrated with data to get ideas for creating your own diagrams, to realize how data provides more context for many diagram types, and to determine which template you want to use.

♦ Connect shapes without drawing connectors. The AutoConnect functionality in Office Visio 2007 connects shapes, distributes them evenly, and aligns them for you with only one click. When you move the connected shapes, they stay connected and the connectors automatically reroute between the shapes.

4.3. Learning Visio

It’s worth investing a little (but not much?) effort into mastering Visio. The standard Microsoft tutorials can be found at http://office.microsoft.com/en-us/visio/CH102262071033.aspx

4.4. Creating DFDs using Visio

In Visio, the DFD is called a data flow model diagram in the "software" category, or diagramme de modèle de flux de données in the “logiciels” category. Here you will find various DFD conventions; one widely used in America is Gane-Sarson.

4.5. Installing SSADM support

In this document, I have followed the SSADM convention (particularly for DFDs). The standard Visio product no longer contains support for SSADM shapes. I can make available a template on request. To use it with an English-language or French-language version of Visio, please follow the instructions below. I am indebted to the Danish specialist Pavel Hruby, whose web site is to be found at http://www.phruby.com/index.html (checked 24/11/2008), for the basic information on which I based this approach.

Typically, Visio 2002 keeps stencils and templates in the folder C:\Program Files\Microsoft Office\Visio10\1033\Solutions\Software in the English-language version of the product, and in the folder C:\Program Files\Microsoft Office\Visio10\1033\Solutions\Logiciel in the French-language version of the product. I believe that Visio 2003 uses the folder C:\Program Files\Microsoft Office\Visio11\1036\ in both the English-language and French-language versions of the product. I believe that Visio 2007 uses the folder C:\Program Files\Microsoft Office\Office12\1036\ in both the English-language and French-language versions of the product.

It is not very wise to put your additional files in these Microsoft folders. I advise you instead to keep them in a different folder, and to tell Visio where to find them. So:

♦ In English-language Visio:

∗ Download the files to any folder, except for the folder in which Visio 2003 stores its own stencils and templates.

∗ Start Visio, click "Tools" and "Options". In the "Advanced" tab, click "File Paths..." and, in the fields "Stencils:" and "Templates:", enter the path to the directory containing the SSADM stencil and template. Restart Visio. The template "SSADM" will appear under the "(Other)" tab, not under the Software tab as it was in certain earlier versions of Visio.

♦ In French-language Visio:

∗ Download the files to any folder, except for the folder in which Visio 2003 stores its own stencils and templates.

∗ Start Visio, click "Outils" and "Options". In the "Options avancés" tab, click "Chemins d’accès..." and, in the fields "Modèles:" and "Gabarits:", enter the path to the directory containing the SSADM stencil and template. Restart Visio. The template "SSADM" will appear under the "(Autres)" tab.


5. Appendix 8 Structured Walkthroughs, a way to improve the quality of analysis

5.1. How to seek for perfection! Improving the quality of our work

This section borrows heavily from Bell, Douglas (2005), himself quoting Weinberg, Gerald M. (1998).

Some aspects of systems analysis require considerable precision. In effect, we are looking for the correct answer to a precise problem. This is a way of thinking which is alien to most of us most of the time. Human beings are used to imprecision! But in analysis and in programming, we seek zero-defect, bug-free implementations.¹² However, in the real world, this is impossible to achieve and very difficult to approach.

Once we have created something, such as an entity relationship diagram, it is wise to assume that it will contain faults. However, we can become very blind to our own mistakes. For this reason, it is a common experience that someone else can spot errors better than the author can. This observation led to the invention of the so-called "structured walk-through". Credit for its invention belongs to Gerry Weinberg, in his book "The psychology of computer programming". Weinberg suggested that programmers see their programs as an extension of themselves. He suggested that we get very involved with our own creations and tend to regard them as manifestations of our own thoughts. Since we are unable to find fault with ourselves, we become unable to see mistakes in what we have created; the recognition of such failings is unacceptable to us. This failing is sometimes referred to as cognitive dissonance.

The solution, as Douglas Bell says in his book "Software engineering for students: a programming approach", is to seek help with fault finding. In doing this we relinquish our private relationship with our work. When applied to computer programming, the approach is sometimes called ego-less programming. But it has much wider application, specifically to anything we create which needs to be more or less correct.

Seeking help can be a completely informal technique, carried out by colleagues in a friendly manner. It must not be a formalised or rigid procedure of the organisation. Such formalisation destroys its ethos and therefore its effectiveness. Instead, if you get a friend or colleague to inspect your work, it is extraordinary to witness how quickly someone else can see a fault that has been defeating you for hours. Studies also show that different people tend to uncover different types of fault. This further reinforces the need for team techniques.

An extension to this approach is the so-called “structured walk-through”. A structured walk-through is simply a term for an organised meeting at which an artefact is examined by a group of colleagues. The aim of the meeting is to try to find faults which might otherwise go undetected for some time. The word structured in this context simply means well organised. The term walk-through means that the producer of the artefact has to explain to the meeting:

♦ step by step, the working of the artefact; and

♦ all the assumptions behind that artefact.

The very act of explaining the artefact, and of course of letting other people look at that artefact, will enable errors or problems to be detected much more quickly. It is important that the meeting concern itself only with the identification of problems or serious errors of style. The designer of the artefact should correct the problems subsequently.

Keys to success in the use of structured walk-throughs include:

♦ Correctly assembling the right group of colleagues.

♦ Distributing the artefact to participants before the meeting.

♦ Total concentration on the artefact itself, rather than the person; individual criticism should be avoided.

♦ Scheduling the meeting in advance, with a fixed duration.

The benefits of structured walk-throughs can be summarised as:

♦ The quality of the artefact is improved because more faults are found, and because errors of style, which can lead subsequently to errors of interpretation by others, are eliminated.

♦ Misunderstandings of the original requirements are more likely to be detected.

♦ The earlier a problem is found with an artefact, the cheaper it will be to fix it.

But there are obvious problems in using this technique in an organisational culture that is not collaborative and supportive.

¹² Please note: the zero-defect ideal is emphatically not expected in the work that you do for assessment. Instead, we are aiming for “good enough”! This appendix is included only because of the extremely useful technique it illustrates.

5.2. References for Appendix 8

Bell, Douglas (2005) "Software Engineering for Students" (4th ed.), Pearson, March 2005. Paperback, 448 pages. ISBN-13: 9780321261274, ISBN-10: 0321261275.

Weinberg, Gerald M. (1998) "The Psychology of Computer Programming: Silver Anniversary Edition", Dorset House, September 1998. Paperback. ISBN-13: 9780932633422, ISBN-10: 0932633420.
