Explain different database design strategies.
1 Answer

Distributed Design

  • Design of distributed systems involves deciding where to place data and programs.
    • Therefore, we could talk about where application programs are placed and where DBMSs are placed.
    • However, that is not of interest to us.
  • We are really interested in organizing data
    • How to partition the data
    • Where to place data partitions
  • The objective is to make data access fast and efficient
    • Through locality of reference
    • Place the data that users will use most often closest to them
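
The placement idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the partition names, site names, and access counts are hypothetical, and the rule used here (put each partition at the site that accesses it most) is one simple heuristic, not a complete allocation method.

```python
# Hypothetical access counts: how often users at each site access each partition.
ACCESS_COUNTS = {
    "customers_eu": {"paris": 900, "new_york": 50, "tokyo": 10},
    "customers_us": {"paris": 40, "new_york": 1200, "tokyo": 30},
    "customers_asia": {"paris": 5, "new_york": 60, "tokyo": 700},
}


def place_partitions(access_counts):
    """Assign each partition to the site that accesses it most often."""
    placement = {}
    for partition, counts_by_site in access_counts.items():
        # max(..., key=...) returns the site with the highest access count.
        placement[partition] = max(counts_by_site, key=counts_by_site.get)
    return placement


def local_access_ratio(access_counts, placement):
    """Fraction of all accesses that are served by the local site."""
    total = sum(sum(counts.values()) for counts in access_counts.values())
    local = sum(access_counts[p][site] for p, site in placement.items())
    return local / total


placement = place_partitions(ACCESS_COUNTS)
ratio = local_access_ratio(ACCESS_COUNTS, placement)
print(placement)
print(round(ratio, 3))
```

The ratio gives a rough measure of locality of reference: the closer it is to 1, the more accesses stay at the site where the data lives.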

Three dimensions

  • The organization of distributed systems can be investigated along three dimensions:
    • Level of sharing
    • Behavior of access patterns
    • Level of knowledge of access patterns

Level of sharing

  • No (program or data) sharing
    • Not really done in sophisticated data environments
  • Data sharing only

    • Programs are replicated where necessary
  • Program and data sharing

    • Neither programs nor data are replicated; both are shared across sites
  • We will examine architectures that support the last two "level of sharing" options

Access Patterns

  • To understand which users need which data, one must understand user and application access patterns

    • Which types of users need which types of data?
    • Where are the users located?
  • Static access patterns

    • Not very common
    • Make it straightforward to design and manage a distributed data environment
  • Dynamic access patterns

    • More likely - users do not always have the same needs over time
    • More difficult to anticipate
    • Difficult to design and manage a distributed data environment
  • Our approach is to address static access patterns only

  • The static approach can serve as a basis for more complex dynamic approaches

Level of Knowledge of Access Patterns

  • How much do we know about how users will access the data?
  • Again, knowledge of access patterns falls along a range

    • No knowledge (hard to know how to distribute data)
    • Partial knowledge
    • Complete knowledge (helps us determine ideal placement of data)
  • Partial knowledge, to some extent, is the more usual case

    • We have to do the best job we can initially
    • We will have to observe usage patterns over time to get a better idea of data access patterns
  • These issues contribute to the design and placement of distributed data
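
Observing usage over time can be as simple as tallying an access log. The sketch below assumes a hypothetical log of (site, table) pairs, one per observed query; the site and table names are invented for illustration. It reports, for each table, which site accesses it most and what share of accesses that site accounts for.

```python
from collections import Counter, defaultdict

# Hypothetical access log: one (site, table) pair per observed query.
ACCESS_LOG = [
    ("paris", "orders"),
    ("paris", "orders"),
    ("tokyo", "orders"),
    ("tokyo", "inventory"),
    ("tokyo", "inventory"),
    ("paris", "inventory"),
    ("tokyo", "inventory"),
]


def summarize_access(log):
    """Count accesses per table, broken down by site."""
    counts = defaultdict(Counter)
    for site, table in log:
        counts[table][site] += 1
    return counts


def dominant_site(counts):
    """For each table, return the busiest site and its share of accesses."""
    result = {}
    for table, by_site in counts.items():
        site, hits = by_site.most_common(1)[0]
        result[table] = (site, hits / sum(by_site.values()))
    return result


summary = dominant_site(summarize_access(ACCESS_LOG))
for table, (site, share) in summary.items():
    print(f"{table}: mostly accessed from {site} ({share:.0%})")
```

A low share for the dominant site suggests that no single placement serves the table well, which is the kind of partial knowledge a designer would refine as more of the log accumulates.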

Distribution Design

  • Top-down

    • Used mostly when designing systems from scratch

    • Used mostly in homogeneous systems

  • Bottom-up

    • Used when the databases already exist at a number of sites

Top Down Design

  • Conceptual design of the data is the ER model of the whole enterprise

    • Must anticipate new views/usages
    • Must describe semantics of the data as used in the domain/enterprise
  • This is almost identical to typical DB design

    • However, we are concerned with Distribution Design
    • We need to place tables "geographically" on the network
    • We also need to fragment tables
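
As an illustration of fragmenting a table, the sketch below splits rows by the value of one column, so that each fragment can be stored at the site that uses it. This is one possible approach (a horizontal split); the table, column names, and rows are hypothetical. It also checks that the fragments can be recombined into the original table.

```python
# Hypothetical rows of a global EMPLOYEE table.
EMPLOYEES = [
    {"id": 1, "name": "Ana", "office": "paris"},
    {"id": 2, "name": "Ben", "office": "new_york"},
    {"id": 3, "name": "Chen", "office": "paris"},
    {"id": 4, "name": "Dee", "office": "tokyo"},
]


def fragment_by(rows, column):
    """Split rows into fragments keyed by the value of one column."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def reconstruct(fragments):
    """Rebuild the full table as the union of all fragments, ordered by id."""
    rows = [row for fragment in fragments.values() for row in fragment]
    return sorted(rows, key=lambda row: row["id"])


fragments = fragment_by(EMPLOYEES, "office")
for office, rows in fragments.items():
    print(office, [row["name"] for row in rows])

# No row is lost or duplicated by the split.
print(reconstruct(fragments) == EMPLOYEES)
```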

Bottom Up

  • Top-down design is the choice when you have the liberty of starting from scratch

    • Unfortunately, this is not usually the case
    • Some element of bottom-up design is more common
  • Bottom-up design integrates independent or semi-independent schemas into a Global Conceptual Schema (GCS)

    • Must deal with schema mapping issues
    • May deal with heterogeneous integration issues
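
A very small sketch of schema mapping follows. It assumes two hypothetical sites that store the same customer data under different column names, and maps both onto one global schema. The site names, column names, and records are invented for illustration; real integration also has to resolve conflicts in types, units, and keys, which this sketch ignores.

```python
# Hypothetical mappings from each site's local column names to the
# column names of the global conceptual schema.
LOCAL_TO_GLOBAL = {
    "site_a": {"cust_no": "customer_id", "cust_name": "name"},
    "site_b": {"id": "customer_id", "full_name": "name"},
}


def to_global(site, record):
    """Rename a local record's columns to their global names."""
    mapping = LOCAL_TO_GLOBAL[site]
    # Columns without a mapping are dropped from the global view.
    return {mapping[col]: value for col, value in record.items() if col in mapping}


def integrate(records_by_site):
    """Merge records from all sites into one list in the global schema."""
    merged = []
    for site, records in records_by_site.items():
        merged.extend(to_global(site, record) for record in records)
    return merged


local_data = {
    "site_a": [{"cust_no": 1, "cust_name": "Ana"}],
    "site_b": [{"id": 2, "full_name": "Ben", "fax": "n/a"}],
}
global_view = integrate(local_data)
print(global_view)
```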