The TurboIMAGE/XL FAQ Hosted by Beechglen Development Inc www.beechglen.com Last Updated: Wednesday, November 25, 1998 This is the TurboIMAGE/XL FAQ. It is meant to be a guide for beginning programmers and data base administrators. NOTE; THIS IS A WORK IN PROGRESS any input, updates, or corrections are most welcome, please send them to: imagefaq@beechglen.com Another terrific on-line resource for the HP3000 is the HP3000 FAQ. It can be located at http://www.3k.com/index_publications.html Any questions or comments should be sent to rcb@3k.com. An indispensable "On-line Encyclopedia for HP Users" can be found at http://www.robelle.com/library/smugbook/ The TurboIMAGE/XL Database Management System Reference Manual can be found at http://docs.hp.com:80/dynaweb/smpe/b1019/b840/@Generic__BookView TABLE OF CONTENTS 1.0 How do I get started? (Introduction) 2.0 What are some of the fundamentals of data management? 2.1 Placement of data on computers 2.2 Data management laws 2.3 History of data management 2.4 How files are accessed 2.5 Data base technologies 3.0 What is the Image Data Base Nomenclature? 3.1 Image terms and vocabulary 3.2 Types and relation of files 3.3 Record formats 3.4 Major subsystems 4.0 How do I define and process a Schema? 4.1 The data base program 4.2 Capacities 4.3 DBschema 4.4 DButil 5.0 What are the Data Base Access Intrinsics? 5.1 Parameters and data types 5.2 DBopen & DBclose 5.3 DBlock & DBunlock 5.4 DBerror & DBexplain 5.5 DBput & DBdelete 5.6 DBget & DBfind 5.7 DBupdate & DBcontrol 5.8 (Dbbegin,DBxbegin) & (Dbend, DBxend) 5.9 DBxundo 5.9 DBinfo 6.0 What are the Utilities and Tools? 6.1 DBSTORE & DBRESTORE 6.2 DBLOAD & DBUNLOAD 6.3 Capacity, structure change, and diagnostic utilities 6.4 Logging and recovery 6.5 DBDRIVER 6.6 MPEX 6.7 DBLOADNG/Howmessy 7.0 How do I begin the design process? 7.1 Defining the data elements 7.2 Building hierarchical structures 7.3 Folding hierarchies 7.4 Prototyping 7.5 Walk Through 8.0 How do I improve on an existing design? 8.1 Common design situations 8.2 Moving data about the shell game 8.3 Critical design assumptions ------------------------------------------------------------------- 1.0 How do I get started? (Introduction) Most people initially learn Image by a slow process of osmosis from their current application environment. Thus the success of the initial learning process usually depends on the skill and consciousness of the people that created and maintain this environment. The development, maintenance, and access of Image data bases can be viewed from many perspectives, the programming aspects, the system management aspects, Operating System aspects, and the user access aspects. This could also be stated as the push-pull relationship between those who create data bases and those who must live with them and keep them working on a day-to-day basis. There are obvious "trade offs" and gray areas between these various types of activities. On one hand, the push is to get the application developed and turned over into production as soon as possible, and on the other hand the priority is to get as many users on the system as possible while providing reasonable response times. It is therefore extremely important that both perspectives understand the limitations that Image places on the other. This is the only way to achieve a successful, cost effective balance between development and production. One way I've heard this simply stated is: "Image" is a do-it-yourself brain surgery kit, you could make yourself as smart or as dumb as you want. It all depends on what you know." Fred White, ADAGER. Specific to Image, there are many features that can be abused by developers that can prove to be a nightmare for those who have to perform maintenance. On the other hand there are many maintenance utilities and subsystems that can be very useful to developers and thus speed up implementation times. Actually to most people, be it applications or system management orientated, Image is a myth, and data bases past and present are legends. The first myth is shown in the three names that Image goes by. Image, TurboIMAGE, and TurboIMAGE/XL. Image was the original name, Turbo Image was a major rewrite of Image that was prompted by the larger number of users supported on the HP3000 Series 6x's, and 70's. And, Turbo Image/XL is a third generation rewrite of Image prompted by the migration to the MPE/XL versions of the HP3000's. From a development and system management perspective, these different versions of Image are very compatible. So, throughout this FAQ only the differences will be pointed out and they will be collectively referred to as simply Image. Another aspect of the Image myth is that it becomes all things to all people. Like any data base management application, Image has its strong points and its weak points. The latter sections of this FAQ describe some of the ways that application and data base designers can get into trouble attempting to overcome Image's weaker points. A final comment on the Image myth is that many system managers and programmers are very superstitious about their data bases. They feel that Image is built of many black boxes, mirrors, magic dust, and secret incantations the details of which are known only to wizards and high priests (HP). This FAQ methodically breaks down these black boxes explaining the components, sweeps out the magic dust, and teaches you the appropriate time and places to invoke the secret incantations. And, this will, of course, make you an Image wizard in the process. ------------------------------------------------------------------- 2.0 What are some of the fundamentals of data management? 2.1 Placement of data on computers The key reason that data is placed on computers is for the ability of the computer to retrieve, sort and summarize. One of the most important aspects of any scheme for sorting, retrieving and reporting is that data should only be sorted on a computer if it has some need to be recalled at a later date, and all data stored should be time stamped so that at some future date, its usefulness can be determined. Data that is stored in any fashion that is not likely to be retrieved and is not obsolete over time will add to the overhead and expense of the more heavily accessed data. 2.1.1 removed 2.1.2 The value of information as a corporate asset The value of information as an asset to the company is dependent on three factors: capacity, reliability and security. Capacity is the capability of the computer to meet the information demand. Information is only valuable if it can provide cost savings or business improvements on a timely basis. Reliability is the quality measurement for a computer's ability to meet information demands on a consistent basis. And security is the attribute which measures the ability to grant access to information on a need to know basis. 2.2 Data Management Laws There are certain rules of thumb, or laws, that can be described in general terms and apply to many situations, including data management. The following are my top four "laws" of data management. I hope you find them amusing and/or insightful. 2.2.1 To err is human, to really foul things up requires a data base. All methods for data storage and retrieval are prone to error situations. We simply haven't got the technology to the point where it is infallible, and certainly evolution hasn't done much to alleviate the problems of human error. The point is that all data base systems must be designed with technical and human failures in mind. To this end the KIS (keep it simple) principle is the only weapon, because the failure rate of any system increases directly with its level of complexity. 2.2.2 Separating keys and pointers from data for faster retrieval The cornerstone of information storage and our information society is the index file. We use them every day, in hundreds of different ways, woven into every day life. My favorite example is the card catalog at the library. This index allows us to search on three possible values. author, title, and subject to quickly determine if a book or books exist and their relative location in the library. The alternative method of scanning each book on each shelf would be many times more time consuming than using the card catalog. The card catalog is not free and has its drawbacks. Cards must be added and removed as books are added and removed from the library, and the books must be arranged in a specific order on the shelves. If you ever want to get on the bad side of a librarian, just put a book back on the wrong shelf. In this case the work expended in maintaining the card catalog and the location of the books is justified by the need to speedily find and retrieve books of interest. Computerized data management schemes use the same principle of separating the index information from the data to allow faster access to the specific records of interest. While computerization allows multiple indices into the data, it should be noted that the marginal value of additional indices drops significantly after two or three, while the overhead to maintain the index is usually linear. 2.2.3 It's when you do the work that counts The decision of which items to index on and how to maintain the data is highly application dependent. If an application's primary function is to produce batch reports, then building in indices that are maintained as each record is added or deleted is probably a waste. On the other hand, an application is required to retrieve transaction information based on a variety of possible key values, then it probably makes sense to maintain indices for each of the possible key values. 2.2.4 There is no such thing as a free lunch All throughout the following discussions of general data management and specific HP3000 and Image issues we will find many trade offs of functionality for a given expense. Generally, the more functionality built into an application, the greater the expense in terms of computer resources required. The only magic trick is to match the functionality of the application with the capabilities of the system and data base software. 2.3 History of Data Management The only way to completely understand the state of the art in data management, we must first discuss the evolutionary history from the past to the present. This will, hopefully, clear up some of the seemingly confusing terms and vocabulary. 2.3.1 Punched card files The first computers used punched cards to store and retrieve information. There were two ways to retrieve information, depending upon whether the file of cards was sorted or not. If the card file wasn't sorted then to find a specific card, the entire file would have to be read to find each card of interest. If however, the card was sorted on the item of interest, then we could scan the file until we passed the point where the cards of interest would have been if present. The sorting of the cards really paid off in the ability to summarize and report for different occurrences of the key value. 2.3.2 Tape files The next innovation in data management was the use of magnetic tape to store greater volumes of information and allow much faster scanning of individual records. The tapes still did not allow indexing, as you still had to read from the beginning of the file going forward until you selected the records of interest. 2.3.3 Disc files After magnetic tape, the next innovation was to spread the Magnetic material on discs and allow the read write head to move over the data. This finally allowed the ability to read records in any sequence. A good analogy would be a tape player versus a compact disc. On your tape player you must listen to each song in sequence on the tape (unless you manually rewind), but on your compact disc you can choose which songs and in which sequence you would like them to play. This capability is called random access. The next song or record played does not have to be physically adjacent to the previous. 2.3.4 Random access Random access capability finally allowed computers to take advantage of indices. As in the library example, key files were built that had pointers to the location on the disc where the data record of interest could be found. These pointers are then used to directly read the desired record. The only aspect which is random is the fact that the record could be anywhere on the disc. The time to read a given record is made up of three components. the time to locate the read/write head over the data, the time to wait for the disc to spin so the data record of interest is under the head, and the time to transfer the data from the magnetic media to the computer. The major portion of the total time is the time to locate the head over the data. 2.3.5 RAM and ROM discs As computers have become increasingly more powerful, the demand for disc IO speeds has also grown. This is because all computers from mini's to mainframes wait for disc I/o's at the same speed. One way to increase disc IO speed is to have more disc drives. Another way is to store the most active data in the RAM (random access memory) memory of the computer and only write it out to disc when the record is changed. This is required because the in memory version of the record would be lost in the event of a power failure, while the disc version of the record will be maintained until it is overwritten. ROM (read only memory) discs are becoming available that store huge quantities of information on a read only basis. These discs have many indices built in for fast retrieval of data. 2.4 How files are Accessed Now that we've discussed a little of the history, we can get down to the specific methods used by Image to access data. It is very important to understand these access methods as they form the very foundation for any further discussion. 2.4.1 Serial access Serial access is the reading of a file from front to back, or back to front. The key connotation is that the file is not sorted in any fashion, so that all records must be read to determine if the specific ones of interest are found. 2.4.2 Sequential access Sequential access is the reading of a file from front to back or from back to front, but the file is specifically sorted so that the search can be discontinued when the point at which the records should have been has past. 2.4.3 Directed access Directed access is the ability to read a specific record number in the file, without having to read any other preceding records. In and of itself, this isn't very useful unless the key value is also the relative record number, or directed access is used in conjunction with calculated or chained access. The case of the key value being the relative record number can be exemplified by having the customer number be the record number in the file, thus to retrieve a customer record we could directly read the record at the customer number location in the file with one read. 2.4.4 Chained access Chained access is simply the storing of relative record numbers with the data as pointers to other logically linked records. For example, in our customer record which we accessed by customer number, we could store the relative record number of the first order record in the orders file and in each order record store the relative record number of the next order record for that customer, thus forming a linked list or chain of entries for each customer. 2.4.5 Calculated access The simple case for the index file is that the key is also the relative record number. This, while simple to implement, leaves something to be desired when selecting possible key elements. What if we wanted to key on something other than a number, i.e. customer name. What if the number we wanted to key on was too large of a range for the disc space available, i.e. social security number. What is required is a calculation for generating a relative record number for these keys such that the result is between 1 and the capacity of the file. This process is called hashing and its purpose is to allow direct reading of the index file record which is of interest. +-----------------------------------------------------+ Data | | | | | | | | | | EOF Records | A00 | Z33 | B01 | U23 | M91 | P00 | F55 | T07 | A00 | Mark +-----------------------------------------------------+ Relative 0 1 2 3 4 5 6 ..... n Record ^ Number Direct Access Read Record #6 Serial >--^------^-----^-----^-----^-----^-----^-----^-----^-- Access Find Record With ID # = A00 Must Read Each Record 2.5 Data Base Technologies Different data base technologies are no more than the packaging of a standardized method of accessing data. The package will only be successful if it meets the tests for capacity, reliability and security. There are many different data base management packages on the market today. They can be classified into three general categories called models. hierarchical, relational and network. These models refer to the fashion in which indexing is used to link different logically related files via pointers and chains. 2.5.1 Hierarchical models In the hierarchical models of data bases, the files are linked from top to bottom, left to right, to form a single master logical record. This forms a tree structure, and there is a single master file that is called the root to which all of the remaining files are related either directly or indirectly. In this model all the indices or relationships of the files in the tree structure are fixed and cannot be changed without unloading all of the data, modifying the data base definition and reloading the data. Hierarchical models lend themselves to very large transaction processing and batch reporting applications. 2.5.2 Relational models In relational data base models the indexing or relation of one file to another is more flexible. Indexes can be built and deleted on an as required basis. This is done at the user level by linking two or more files by a common field. This can be a very powerful tool for application environments where the amount of data is small and the information requirements are very fluid. 2.5.3 Network models The network model is similar to the hierarchical model except there can be more than one root file. Image is a two level network model. The first level of files are called master data sets, and the second level of files are called detail data sets. A data set is the Image term for a file. The master data set serves as an index into up to sixteen detail data sets, and the detail data set can have indices from to one to sixteen master sets. ------------------------------------------------------------------- 3.0 What is the Image Data Base Nomenclature? 3.1 Image terms and vocabulary In order to appear to have some unique corner on data base technologies, the marketers of various data base software will invent a new set of acronyms and vocabulary to make the old and common place seem new and worth paying for. So, each data base package seems to have its own jargon, and it may help to equate these image terms with some other ones you may be more familiar with. Capacity - The number of records or entries a data set may contain. Chain - A set of logically related records in a detail data set which are linked together by pointers to the previous and next entry. Chain Head - The pointers in the master data set that point to the first and last entry in the related detail that contain the logically related entries. Data Entry - A record in the data set made up of data items or fields. Data Item - The named field of a given data type. Data Base - The set of related files or data sets. Hashing - The calculation that accepts a key value as input and generates a relative record number between 1 and the capacity of the index file. Path - The physical relationship between the master data set and the linked records of a detail data set. Relative Record Number - The number which represents the distance from the first record in the file to the current record in count sequence. Search Item - The data item whose content is used to logically form a path from the master to the linked list of detail records. 3.2 Types and relation of files 3.2.1 Manual masters Image master data sets are index files which can optionally hold data items. The master file can only have one key value and this entry must be unique. This is because key access to this file is via a calculated access based on a hashing algorithm. There are two kinds of master data sets: manual and Automatic (kind of like cars). Manual masters can have additional data items besides the search item. The name "manual" comes from the fact that these data entries must be written and deleted by an application program. Manual masters work well if there are a few small fields and there will never be a need to access the records on one of the non-search item fields. The non search item access would require a serial search of the entire master data set. 3.2.2 Automatic masters The Automatic master cannot have any non-search item fields as the data entries are added and deleted as required by Image to serve as index records for detail data sets. Automatic masters are easily added and deleted from the network design using any one of the Image utility packages. Both Automatic and manual masters can contain chain heads, pointers to logically related detail data set records. This is the real power of Image as it allows the ability to make subsets of entries in the associated detail data sets. This chain head holds the relative record numbers and count for the first and last entries of the linked list of detail records. 3.2.3 Detail data sets Detail data sets are used to hold records which are accessed via the linked lists forming chains or in front to back or back to front serial fashion. 3.2.4 Master to detail relationships The value of the path from a master to a detail data set depends entirely on the average number of detail records tied to a master entry. This ratio can range from 0 to the number of entries in the detail data set. If the average is greater than ten percent of the number of detail entries, then the key value chosen is a waste because it doesn't make a small enough subset to warrant chasing the chain. It would be easier to serially search each detail. A frequent example is plant code in a company with two or three plants. If an inventory detail with 999,991 entries, the average plant chain length would be 500,000 to 333,333, not a very useful subset. On the other hand, if the average is less than one, it is probably a design with a small detail tied to a large master. The large master is probably linked to a separate large detail. This is a waste because the master in question must have pointers to the small detail that it can never use. 3.3 Record Formats The actual file record for an image data set is made up of a bit map, pointer information, and at least one data item. The records are blocked together based on a parameter that specifies the maximum block size, and Image takes care of fitting as many records as it can in that block and wasting any leftover space. Logical Record Format of an Image Master Data Set |Chain Head for Path 1| |Chain Head for Path n| | Data Part of Entry | +---------------------+ +---------------------+ +---------------------+ |CHAIN | FIRST | LAST | |CHAIN | FIRST | LAST | |KEY | DATA ITEMS FOR| |COUNT | ENTRY | ENTRY| |COUNT | ENTRY | ENTRY| |ITEM | MANUAL MASTER| +---------------------+ +---------------------+ +---------------------+ | 3 | 3 | 1 | | 50 | 22 | 21 | |TEST | TEST DATA | +---------------------+ +---------------------+ +---------------------+ | | | | | | +-------------------------------------------+ | | | Detail Data Set | | | +-------------------------------------------+ | | | Forward | Backward | Key | Data Items | | | | Pointer | Pointer | Item | | | | +-------------------------------------------+ | +--> 1 | 0 | 2 | TEST | | | +-------------------------------------------+ | 2 | 1 | 3 | TEST | | | +-------------------------------------------+ +---------->3 | 2 | 0 | TEST | | +-------------------------------------------+ Physical Block Format of an Image Data Set +----------------------+ +---------------------+ +---------------------+ |MPE/V File Label Block| | 1st Physical Block | | Nth Physical Block | +----------------------+ +---------------------+ +---------------------+ |MPE | USER | EMPTY | |BIT | MEDIA | MEDIA | |BIT | MEDIA | MEDIA | |FILE | LABEL | TO | |MAP | RECORD | RECORD| |MAP | RECORD | RECORD| |LABEL | | END | | | 1 | N | | | | | +----------------------+ +---------------------+ +---------------------+ +------+ +-----------------------------+ +---------------------+ |MPE/XL| | 1st Physical Block | | Nth Physical Block | +------+ +-----------------------------+ +---------------------+ |FILE | |USER | BIT | MEDIA | MEDIA | |BIT | MEDIA | MEDIA | |LABEL | |LABEL | MAP | RECORD | RECORD| |MAP | RECORD | RECORD| |TABLE | | | | 1 | N | | | | | +------+ +-----------------------------+ +---------------------+ 3.3.1 Bit maps At the front of each block, Image builds a bit map to represent the availability of the records in the block. This saves time in accessing the individual entries of the block to see if it is occupied. These maps play an important role when the master or detail is serially searched, in that especially the case of masters we may read many empty blocks before one is found that contains a record. Bit Map Format of an Image Data Set +----------------------------------------------------------------+ | Bit Map | ENTRY 1| ENTRY 2| ENTRY 3| ENTRY 4| ENTRY 5| ENTRY 6| +----------------------------------------------------------------+ | 11001000 | MEDIA | MEDIA | EMPTY | EMPTY | MEDIA | EMPTY | | 00000000 | RECORD | RECORD | RECORD | RECORD | RECORD | RECORD | +----------------------------------------------------------------+ 3.3.2 Pointer information Remember that pointers are simply stored relative record numbers that form the physical structure of the data base. The pointers in master records form a chain head consisting of a first entry and a last entry pointer and a total entry counter. This chain head occupies six words (12 bytes) for each chain head or path for a total up to 16. In each master record there is also a chain for entries whose search item hashes to the same relative record number. When this "collision" happens, the entries are chained together on a secondary chain. In the detail each path has a set of forward and backward pointers which allows the ability to read, add to, and delete entries in the chain. The chain pointers in the detail take up four words (8 bytes) for each path for a total of up to 16 paths. 3.3.3 Data content The data content of each entry is made up of the data items specified for the data set. The maximum data entry size is 2048 words (4096 bytes). 3.3.4 Blocking factors The blocking of records is done by specifying a maximum block size in the data base definition, and Image works backwards to fit the optimum number of records per block as to not waste disc space. The actual record is made up of the bitmap plus the pointers plus the actual data items, and is called a media record. The default block max is 512 words (1024 bytes or four sectors) and should be changed to 2560 words (5120 bytes or 20 sectors). This will allow Image to pick the most optimum value. 3.4 Major Subsystems Image is a complete data base package. It comes with the utilities for building and installing the data base, subprograms for accessing the data base programmatically called intrinsics, utilities and tools for maintenance and recovery, and a report writer called QUERY/3000. 3.4.1 Implementation overview In the beginning, the programmer designs reports and interactive screens that will make up the particular application or system desired. The data items from the report and screen layouts are defined in terms of type and relationship to each other. The data base is designed and coded into the form of a data base definition called a schema. This schema is processed by a compiler into a set of files which form the data base. The application programmer can now build programs that access the data base via subprograms called intrinsics. During the debugging process, QUERY could be used to load test data and to interactively check the results of programs under test. The various data base utilities and tools are used to configure and maintain the data base while in production. 3.4.2 Schema definition and processing The schema is developed either by manually programming it using EDIT/3000, QUAD or some other text editor, or generated automatically from a data dictionary. Either way the data base program is compiled and the resulting object module is then used to build the final data base. 3.4.3 Data base access intrinsics The data base access intrinsics are the only way to access the data base. All utilities and applications must use this set of subprograms or privileged mode. Accessing an Image data base in privileged mode is not supported by Hewlett Packard and is best left up to the experts. These intrinsics provide the common access point for controlling security and reliability of the data contained within. 3.4.4 Utilities and tools There are many data base utility tools to facilitate maintenance of Image data bases. Later chapters will go into the HP supplied and more popular contributed library and purchasable utilities. 3.4.5 QUERY\3000 Query is a non-programming tool for entering test data, modifying limited amounts of data online, and producing simple reports. Query is also useful for displaying the current number of entries and the structure and content of data bases. ------------------------------------------------------------------- 4.0 How do I define and process a Schema? 4.1 The data base program The data base is specified using a simple program called a schema. This program is a simple free format MPE file which is input to a compiler DBschema.PUB.SYS. The schema and the output from the compiler provide important documentation regarding the installation of the data base. While the data base programming language is simple and free format, the tricky part is the use of periods, commas, and semi-colons which are used as syntax dividers. 4.2 Begin A throwback to the SPL programming language, the begin statement is used to start the schema and name the data base. Example: BEGIN DATA BASE TEST; The data base name can be up to six alphanumeric characters, the first character must be alphabetic. The individual data sets will be identified by appending a two digit numeric counter i.e. 01, 02, 03... so it usually is not a good idea to name your data base DB01, because the name of the first set will be DB0101, the second set DB0102 ..., this can be very confusing. The data base name is really DBNAME.group.account, so before the data base is built, consider what group and account it should reside in. This can save a lot of trouble and effort down the road. 4.3 Passwords Passwords are used to screen read and write access at the data set and data item. It is usually sufficient to code passwords at the set level only. Passwords at the item level add processing overhead, and this type of security is better dealt with in the application program. The Image password can be up to eight ASCII, characters, but the first character must be alphabetic. The use of lower case or non-printable characters can raise security but make it very difficult to input passwords interactively. Hard coding passwords into programs makes them very difficult to change, reading the current passwords in from a control parameter file allows them to be altered on a periodic basis without recompiling programs. Example: PASSWORDS: 10 READER; 20 WRITER; ^ ^ | | | + - Password to be supplied at data base open | +----- Arbitrary class number from 1 to 63 >> 4.3.1 User class table Each password maps to an index number from 1 to 63, this allows for up to 63 different passwords, the numbering convention used is up to the data base designer. The number is then used as shorthand notation to specify which passwords allow access to data items or data sets. A class number of 0 is assigned to an open request that specifies a password not on the password list. This class can be assigned to data items and sets to allow access even with a bad or missing password. A class number of 64 is assigned to an open request that specified a ";" and the user is signed on as the creator of the data base. The creator is simply the user who ran DBschema and DButil to build the data base. This class turns off all security and allows full access to all items and sets. 4.4 Data item definition The items section of the schema program defines each field that will exist in the data base. These fields are used later in the definition of the individual files or data sets that make up the data base. An example of the items section of the schema: ITEMS: CUSTOMER-NAME, X40 (10/20); << CHANGE TO 40 ON 1/2/88 >> ADDRESS-LINE-1, X40 (10/20); ADDRESS-LINE-2, X40 (10/20); ADDRESS-LINE-3, X40 (10/20); CITY, X40 (10/20); STATE, X2 (10/20); ZIP, X10 (10/20); ^ ^ ^ | | | | | +- Read/Write Class List | +- Type and Sub-Item Length +- Data Item Name >> 4.4.1 Item name constructs The item name can be up to 16 alphanumeric characters, but the first character must be alphabetic. The item names are used in a request list at data access time and must match exactly as defined in the schema. Thus as a practical concern, items should be named in upper case and be long enough to describe the field but extra prefixes and suffixes should be avoided. The maximum number of data items per data base is 1,023. 4.4.2 Item definition parameters The actual storage of the item is defined using three parameters, the sub-item count, the type designator and the sub-item length. The sub-item count is an array specification, it defaults to 1. Arrays are a simple method of storing related items i.e. monthly expenses etc, however most report writers do not handle sub-item counts other than the default of 1. Some of the older data bases were forced to make extensive use of sub-item arrays due to an old limitation of 255 data items per data base. The type designator is used to tell Image how to store the field within the record. Note each field must be defined to be aligned on a 16 bit boundary. 16 bit item types: I - Signed binary integer in 2's complement J - Same as I but allows Query to edit input for COBOL COMP format K - Absolute binary value R - An HP formatted floating point number 8 bit item types (must be used in multiples of two) U - ASCII data but no lower case alphabetic characters allowed X - Any ASCII data Z - Zoned decimal format 4 bit item types (must be specified in multiples of four) P - Packed decimal number Item Length COBOL FORTRAN SPL Type Bytes I 2 S9 -S9(4) COMP INTEGER INTEGER I2 4 S9(5) -S9(9) COMP INTEGER*4 DOUBLE INTEGER I4 8 S9(10)-S9(18) COMP J 2 S9 -S9(4) COMP J2 4 S9(5) -S9(9) COMP J4 8 S9(10)-S9(18) COMP K1 2 LOGICAL LOGICAL R2 4 REAL REAL R4 8 DOUBLE LONG Un n Display PIC A CHARACTER BYTE Xn n Display PIC X CHARACTER BYTE Zn n Display PIC 9 CHARACTER Pn n/2 COMP-3 NUMERIC 4.4.3 Read/write class list As mentioned above, the user class numbers are used to screen access at the data item and data set levels. This is accomplished by specification of a read/write class list at the data item and set level. The security logic works in the following fashion: At data base open the class number is mapped from the password. If the creator password is used at DBopen, the no read write screening is processed. If the password used at DBopen is not in the password list, then a class of 0 is assigned. Otherwise, the class number is simply mapped from the password provided. At data set access, the set read/write class list is checked in the following fashion. If the class is on the write side of the list specification the user has full access to all items in the set. If the class is on the read side of the list specification, this simply means that we will be allowed to process the item level security screen to see if we have read or write access to each individual item. This adds additional overhead for each data set access. If the class is not on the set read/write class list then no access is allowed to that set. At data item access, the read/write class list is processed to allow read or write access to each individual data item, and is only processed if the class number was on the read side of the class list at the set level. If the class is on the read class list for the item, then read access is granted. If the class is on the write class list then read and write access is granted. And if the class is not on the class list, then no access to the item is allowed. 4.4.4 Comments Documentation in the form of comments can appear anywhere in the data base schema simply be using << to start the comment and >> to end it. These comments can span multiple lines, and any text in between is ignored by the compiler. Comments at the item level can be useful in documenting the use of the data item. Comments can also be used to document changes made to the data base over time. 4.4.5 Sorting of items table In data bases with a large number of data items it can be very helpful if the item names appear in sorted sequence. This can save a great deal of time when referring to the schema for the item type and size in order to build data definitions for programs. The items can be sorted by using editor to specify the data items then using sort to arrange them in alphabetic sequence, and then adding the rest of the schema definition. 4.5 Sets The data set is made up of up to 255 individual data items. There are two kinds of data sets. Master key data sets and Detail Data sets. The data base can have from one to 191 data sets. The data set name can be from one to 16 alphanumeric characters, but the first character must be alphabetic. 4.5.1 Master Data Set Definition Master data sets are accessed by hashing a key value to generate a relative record number. The Image master can have only one key item. There are two kinds of Image master data sets, manual and automatic. The manual master is used if data must be stored with the key record, and the data entry must be manually written to the data base. The Automatic master cannot contain any data items other than the key data item, and is automatically added or deleted as required. This will become clearer when detail data sets are discussed. An example of a manual master data set schema definition: SETS: << data set name. type class list device class >> NAME: CUSTOMER-MASTER, MANUAL (10/20), DISC1; ENTRY: CUSTOMER-NAME(1), <<-- key item and path count >> ADDRESS-LINE-1. ADDRESS-LINE-2, ADDRESS-LINE-3, CITY, STATE; <<-- note the ; to end the list of items >> CAPACITY: 100001,101,101; << maximum capacity,[initial capacity [,increment]] >> An example of a manual master data set schema definition: SETS: << data set name. type class list device class >> NAME: ORDER-NO-MASTER, AUTOMATIC (10/20), DISC1; ENTRY: ORDER-NO(1); <<-- key item and path count >> CAPACITY: 101; The read/write class list on the data set level is specified similarly to the item level, as previously discussed, write access at the set level gives full access to all data items in the data set. Read access at the set level means that you must examine the item level access security to see if your class has read, write, or no access to each individual data item. The device class parameter on the data set allows the creator to specify which disc device class the data set is to reside on. There are some important performance notes to be made at this point. The first is that if each disc drive only has a device class of disc on it then the only class you'll be able to specify is disc. Therefore, it is a good idea to configure each disc with a unique device class i.e. disc1, for LDEV 1, disc2, for LDEV 2, etc. In order to spread out the disc I/o's and reduce access contention, it is a good practice to put masters and details on different disc drives. Due to their access method, masters tend to be very random and should be placed on non-cached discs. The access to details can range from being very sequential to very random depending on the volatility of the data entries and timeliness of proper maintenance procedures. The item names that make up the data set can be specified in any sequence. However, they must have been previously defined in the items section. There is no problem with having an item name that is identical with the set name. The path counter is used to identify both the data item that will serve as the key and the number of associated detail data sets linked to the master by the value of the key. A master can have from 0 to 16 paths to various detail data sets. A value of 0 simply is used to identify a key data item for a master with no detail data sets. The capacity of a data set can range from 1 to 2,147,483,647 data entries. When the data base is built, all disc space required to hold the specified number of entries is allocated and initialized to binary zeros. 4.5.2 Detail data set definition The detail data set is used to hold data records that occur multiple times for a given key value i.e. all order summaries for a given customer. The detail is accessed via a pointer from the associated master to a linked list of entries in the detail with the same key value. A detail unlike a master can have up to 16 keys, which are in fact pointers from associated masters. The keys in Automatic masters are added and deleted as records are added or deleted to associated detail data sets. The manual master entry for a given key value must be added by the user application prior to adding any detail records with the same key item value. An example detail data set specification would be: NAME: ORDER-SUMMARY, DETAIL (10/20), DISC2; ENTRY. ORDER-NO(ORDER-NO-MASTER), <1/4 path from master based on value of order-no >> CUSTOMER-NAME(!CUSTOMER-MASTER(TOTAL-DOLLARS)), <1/4 path from customer-master based on value of customer-name, additionally the list of all orders for the same customer-name will be maintained in sorted sequence by total-dollars >> TOTAL-DOLLARS; CAPACITY: 300000,1000,1000; << maximum capacity,[initial capacity [, increment]] >> The path names are simply the name of the associated master. A detail can have from 0 to 16 paths to the same or different master data sets. These paths serve to make subsets of records in the detail for given key values. Thus it is wise to choose paths that will yield a subset that is worth the effort of maintaining the pointers and chains. A good rule of thumb is that a path is only valuable if it can make a subset that is 15% or less of the detail records. Any greater than 15% and it becomes increasingly more advantageous to simply read all of the detail records selecting for the one you want. The sort item is used to specify maintenance of the list of data entries with the same key value in some sequence other than which they are added over time. THIS CAN BE EXTREMELY DANGEROUS if the records are not added in sorted sequence and the number of entries with the same key value is greater than 40 or 50. 4.6 End. The end statement is a throwback to the SPL programming language, its only purpose is to tell the schema compiler that this is the end of the program. However without the end statement the data base will not compile. Another note is that as will be discussed in a moment, the schema compiler requires file equations and run parameters to function properly. From time to time you will forget these complexities and run the DBschema program without them. One solution is to break and abort the program, another is to type END. and this will terminate the program. 4.7 Capacities & maximums 4.7.1 HP3000 system limitations A somewhat official operating system limits white paper can be found at: http://jazz.external.hp.com/papers/limits/os_limits.html 8190 maximum number of concurrent processes 2004 databases can be opened concurrently on a system 1140 processes can open one database in modes 1-4 (write)access when IMAGE logging is enabled. User logging is still all compatibility mode. 127 DBopens per process without calling DBclose 63 DBOPENs per process to open the same database 4.7.3 Data base level limitations: 1023 data items can be defined in a database 199 data sets can be defined in a database 63 passwords can be defined in a database (65 if 0 & ; count) 4.7.4 Data set limitations: 255 data items can be defined in a data set 16 paths can be defined in a master or detail data set 2,147,483,647 entries maximum in a data set 2,147,483,647 entries maximum in a chain 16,777,215 blocks per data set (recently this was changed from 8,388,607) 4.7.5 Data item limitations: 255 subitems can be defined in a data item 510 bytes for maximum subitem length 4096 bytes for maximum item length 4.7.6 Length limitations: 5120 bytes for maximum block length 252 bytes for maximum b-tree index key size 4.7.7 Transaction limitations: 15 databases can have simultaneous transactions (either of DBBEGIN/DBEND or DBXBEGIN/DBXEND transaction pairs) 4MB opened transactions for a process 64MB opened transactions for a volume set 4.8 DBschema The schema compiler is a program which reads the text file and compiles it into an object module called a root file. The compiler prints a heading, listing of the schema, and a summary table. The schema compiler is invoked by using the command: 4.8.1 Parameters and file equations RUN DBschema.PUB.SYS;PARM=n The parameter value of n is used to tell the compiler program what to do as far as input and output files are concerned. n = 1 ; the schema file has been file equated to DBSTEXT n = 2 ; the list " " " " " DBSLIST n = 3 ; both files have been equated as in 1 & 2 If no parameter is supplied, then the DBschema program will expect you to enter the schema at your terminal " a pretty good feat ), and you will have to break and abort or enter END. as described above. 4.8.2 Compile options The schema compiler allows for setting of compile options via a record in the text file that starts with $CONTROL, the form of the $CONTROL directive is: $CONTROL [ LIST NOLIST ] [ ,ERRORS= nnn ] [ ,LINES= nnnnn ] [ ,ROOT ,NOROOT ] [ ,BLOCKMAX= nnnn ] [ ,TABLE ,NOTABLE ] [ ,JUMBO ,NOJUMBO ] Where: LIST causes the source text to be printed to the list file, this can be a problem if your schema is long and you ran DBschema;PARM=1. The default is to list all of the text file as the schema is compiled. NOLIST specifies that only text file records with errors are listed. This can be handy for processing long text files and for keeping the content of a data base secret. ERRORS=n sets the maximum number of errors to n. The default value is 100. When the schema processor reaches this number of errors, it gives up and terminates. This is very useful for saving paper and your pride. LINES=n sets the number of lines per page for the list file, the default is 60 lines per page but this number can range from ' to 32767. ROOT tells the schema compiler to create the root file if no errors are detected. This is the default mode. NOROOT tells the compiler to compile the schema but not to create the root file even if no errors are detected. This option is very useful for testing out modifications to the schema before purging the existing data base, and saves the trouble of having to go into DButil to purge the root. BLOCKMAX sets the size of the largest physical block of records for each data set. The default is 512 words or 2560 words or 20 sectors. DBschema determines the number of media records that fit in a block. Note that DBSCHEMA chooses a block size (less than or equal to the maximum block size) that makes the best use of disk space, which often is substantially less than the BLOCKMAX. If the record size is greater than 512 words, BLOCKMAX must be set greater than or equal to the record size. In general, a setting of 2560 will allow DBschema to choose the best blocking factor. JUMBO allows a data set whose capacity generates a file greater than 4GB, to automatically become a jumbo data set. Be aware that many utilities and diagnostic tools are not yet upgraded to handle Jumbo data sets. IT IS ALSO IMPORTANT TO NOTE THAT THE JUMBO OPTION WORKS BY SPLITTING THE DATASET INTO MULTIPLE FILES THAT RESIDEING THE POSIX HFS. THESE FILE MAY NOT GET BACKED UP AND RESTORED WITH EXISTING JOB JCL SPECIFICATIONS! NOJUMBO the default disallows data sets greater then 4GB. There are two other 2 compiler directives, the $TITLE and $PAGE commands are used to format the schema listing for documentation purposes. The form of the $PAGE command is: $PAGE [["some character string"], ] The $PAGE command causes the compiler to page eject and print the content of the character string, if any, at the top of the next page. The character string will override the $TITLE command. The $TITLE command is used to specify a string of characters to be printed as a heading for each page printed. It does not cause a page to be ejected as is the case with the $PAGE command. An example of the $TITLE command would be: $TITLE [["some character string"],] The $TITLE command stays in effect until a subsequent $TITLE or $PAGE command is processed. If no character string is included, then no heading is printed. 4.9 Interpreting the compiler listing Errors are listed as the schema file is being scanned. Most errors will be due to lack of attention paid to punctuation in the form of :,, and .'s. If there are no errors, the message UNREFERENCED ITEMS. may appear. While items defined in the items section which are not specified in any of the data sets are not considered as errors, they should be removed to decrease the overhead of processing the item names list. The summary table at the end of the schema listing is very important to understand and retain, as it represents the physical realities of the data base, and some of the information cannot be found elsewhere. DATA SET NAME - Is the name given to the set in the schema SETS. section. The MPE file name will be the data base name with a counter that starts at 01 and counts up as each data set appears sequentially in the schema. TYPE - Is a one letter denotation: "A" for Automatic masters, "M" for manual masters, and "D" for detail data sets. FLD CNT - Is a counter of the number of data items in each data set. PT CT - Is a path counter. For master data sets this is the number of paths that appears in ()'s after the search item. For details this is a count of the number of associated masters, specified by putting the master set name in ()'s after the linking data item. ENTR LGTH - The length in words of the data portion for the data entry. This is the sum of all the individual data items specified for the data set. MED REC - This is the total length of the Image media record. This includes data, pointers and other internal information. The length of the media record is an important design consideration and will be covered in detail later. CAPACITY - For masters this is the exact number of entries specified in the schema in the SETS section. For details, this number may be rounded up to fill out the last full block. BLK FAC - This is the best fit of the number of media records that could fit under the BLOCKMAX specified, or the hard coded number specified in the capacity. BLK LEN - This is the actual length of the block. Any difference between this value and the next multiple of 512 is wasted disc space. The total number of wasted sectors can be calculated by multiplying the number of blocks by this difference. The total number of blocks is simply the capacity divided by the blocking factor, rounded up by one. DISC SPACE - The total number of sectors including the root file that this data base will require when the data base is created. Note that for very large data bases it is usually a good practice to check if enough disc space is available before proceeding. ROOT LENGTH - This is the length in words of the compiled root file. The size of this varies with the number of passwords, data items, sets and security specifications. BUFFER LENGTH - is the length in words of the largest block it is unused by TurboIMAGE/XL but provided for compatibility with TurboIMAGE on previous versions. TRAILER LENGTH - This is the length in words of an area that is used by Image to move data base access parameters to and from the calling program's stack. Schema processing errors are documented in Appendix A of the Image manual. Errors tend to build on each other as the compiler trips over itself, so don't let your pride get hurt when many errors are reported. Two helpful hints: one is to specify data items uppercase characters and have a handy sorted list of them as you build the data sets. Second, pay attention to the punctuation. After all, the schema is a program and therefore highly cantankerous when it comes to syntax. 4.10 DButil After the schema has been successfully compiled, the DBschema program will leave behind a root file. While interesting, this file is of little useful value until the data bases are created. The data base is built using the utility program DButil.PUB.SYS, as we see in the following discussion DButil is a general utility that allows the data base administrator to configure and maintain the data base. There are many commands available in DButil and they can be categorized them into four groups, the create and purge commands, the show command, the enable command and the set command. 4.10.1 Create & Purge commands The CREATE command initializes each data set and saves them as privileged MPE files in the same group as the root file. To create the data base, you must be logged on with the same user account group that was used when the DBschema program was executed. The format of the CREATE command is: CRE[ATE] data base name [/maintenance word] The data base name is the same as the root file name which is the same as the base name specified in the BEGIN statement of the SCHEMA. The maintenance word is a password that can be used to allow users other than the creator of the data base to use a limited set of DButil commands. This user must be logged on into the group in which the data base resides. For large data bases, it is wise to check that enough disc space is available in sufficient chunks before executing the create command. If the create operation fails due to lack of disc space, two problems must be dealt with. First the data sets built must be purged using the PURGE command, and second, free space must be found or added in sufficient quantities to reattempt the build operation. The PURGE command removes the data base files from the system directory, and once entered, the data in the data base is forever lost. So extreme caution is recommended before using this command, always make sure that you are logged on where you think you should be because many a production data base has been purged because someone thought that they where logged on in the test account! The format of the PURGE command is: PUR[GE] data base name[/maintenance word] 4.10.2 Show command Once the data base is built the default configuration can be displayed with the SHOW command. The SHOW command can provide a list of users accessing the data base, any users waiting or holding locks on the data base, logging status, configuration flags, and the buffer allocations. The format of the SHOW command is: SH[OW] database name[/maint word] ALL [OFFLINE] BUFFSPECS CAPACITY CIUPDATE DEVICE FLAGS INDEX LANGUAGE LOCKS LOGID LOGINFO MAINT PASSWORDS SUBSYSTEMS USERS MAINT - Displays the maintenance word if any. This is very useful if you have forgotten what you originally set it to. ALL - Displays the combination of MAINT, BUFFSPECS, LOCKS, USERS, LOGID, AND FLAGS. This is the parameter that is used most frequently and supplies a neat little report on the configuration of the data base. BUFFSPECS - Displays the current buffer specification. The buffer specifications will be discussed in detail in the SET command. CAPACITY - Displays a capacity report for the data base. For each data set it shows the current number of entries, the percent of maximum, and for a detail data set the dynamic expansion parameters. CIUPDATE - Displays the status of critical item update. DISALLOWED means that DBupdate can not modify a search or sort item. ALLOWED means that after a call to DBcontrol that a program can use DBupdate to modify search or sort items. And, ON means that Dbupdate can modify search and sort items. DEVICE - Displays the device assignment for each data set. FLAGS - Displays the various states of the data base configurable parameters. Each parameter is discussed in detail in the SET and ENABLE commands. LOCKS - Displays the users who are currently holding or requesting locks for the data base, data sets or data items. This can be very insightful when deadlock situations occur. This happy event will be discussed in detail when we get to the DBlock intrinsic. LOGID - Displays the current data base log identifier if any. USERS - Displays the users and application programs if any, which are currently accessing the data base. PASSWORDS - Displays the user class table and current password settings. This is handy if you forget what the passwords to the data base were and don't happen to have the schema. SUBSYSTEMS - Displays the flag that allows or disallows the data base for access by Query and other applications that check this flag. OFFLINE - Sends the information to the line printer with the formal file designator of DBUTLIST. 4.10.3 Enable and Disable of configuration flags The enable and disable commands turn on and off internal flags for access, dumping, logging, ILR and recovery. These flags are very important to understand from the standpoint that they control the way that Image operates on the data base as a whole. The form for the enable and disable commands is: EN[ABLE] database name[/maint word] FOR parm[[,parm]...] Valid parameters are: ACCESS DUMPING ILR LOGGING RECOVERY ROLLBACK AUTODEFER MUSTRECOVER PREFETCH INDEXING HWMPUT SHADOWING DIS[ABLE] database name[/maint word] FOR parm[[,parm]...] ACCESS - Allows or disallows user, or application access to a data base. This is very useful for securing the data base for periods of time i.e. weekends and holidays. It protects the data from being read, written or deleted. It does not protect from an accidental restore of a different version of the data base. DUMPING - This flag, when enabled, tells Image, upon abort, to copy the users stack, image buffers and other binary information to an "I" file created in the group where the session or job is logged on. It is very rare to see an abort while in an Image subroutine. However, an Image abort usually means that the user or application program was attempting to access the database and ran into some type of corrupted data. In this case the "I" file would be of assistance in debugging the type and extent of the damage. This flag defaults to the disabled state and remains that way until a problem situation is encountered. HWMPUT - See http://www.adager.com/TechnicalPapersHTML/HwmPut.html ILR - Flag for Intrinsic Level Recovery. This facility of Image allows the backing out of the last record written or deleted to the data base if the system failed prior to the completion of the operation. In other words, the intermediate state of the data base when a record is being added or deleted is inconsistent, and if the operating system or hardware should fail, halt or abort at this critical stage, we could be left with a damaged data base, unless ILR is enabled. Fortunately, HP3000's tend to be very reliable systems, and ILR is overkill for most data bases. However, ILR is recommended for data bases that contain data which the financial wellbeing of your company rests and cannot be recovered manually, i.e. online order entry, billing etc. The drawback to ILR is that it adds additional processing and disc I/o's to each record added or deleted, to the tune of 10% to 15%. If you are experiencing a rash of system problems, it is wise to enable ILR until things settle down. LOGGING - Setting this flag is part of a series of steps to turn the logging of transactions to disc or tape media on and off. This procedure will be discussed in detail later. PREFETCH - IMAGE is currently 'single threaded' when it comes to calling intrinsics that change pointer information. This means that only a single process is executing a call to DBput, DBdelete, or DBupdate (of a critical item) at a time. This methodology helps to insure against situations that create conflicts with pointers that would result in broken chains. In order to accomplish this, IMAGE uses a flag known as the PUT/DELETE semaphore to freeze access. This freeze is obtained at the beginning of the call and released at the end, and is held while entries and pointers are updated. The Prefetch flag tells IMAGE when to obtain the semaphore freeze. With Prefetch disabled, (the default), IMAGE obtains the semaphore before reading, updating and writing every data block which is to be modified by the call. After this is successfully completed the semaphore is released. With Prefetch enabled, IMAGE reads all the data blocks that are required for the call before the semaphore is obtained. It is the holding of the semaphore for a shorter period of time which could allow more concurrent calls but it is not guaranteed. On systems with memory contention there is the possibility that the necessary blocks may be flushed out of memory by another process before IMAGE is able to update them. This could cause the blocks to be thrashed in and out and is why the default is not to Prefetch. Thus to gain throughput from Prefetching the system must have memory and CPU available to handle the increased processing, and the application mix would have to consist of multiple processes consisting of mostly calls to DBput, DBdelete or DBupdate. RECOVERY- Enables or disables the running of the program that is used to apply transactions from the log file. Setting this flag is part of the recovery procedure, which will be discussed later. This flag defaults to the disabled state and protects from corruption for the data base by an inadvertent attempt to reapply a transaction log file. 4.10.4 Set parameters The Set command is used to configure the run time variables of the data base. These differ from the parameters of the enable/disable command, in that they are more complex than a simple on/off status. However, as in the enable/disable commands these variables can significantly affect the behavior of the data base. The format of the Set command is: SET database name[/maint word] BUFFSPECS= num bufs(from/to) LOGID=logid-value MAINT=[maintenance word] SUBSYSTEMS = NONE | READ | RW PASSWORD j=[password] LANGUAGE=langid CIUPDATE = [DISALLOWED ] [ALLOWED ] [ON ] BTREEMODE1 = [, [WILDCARD=]c] MAINT - This parameter is used exclusively by the creator of the data base to set a new maintenance word for the data base. The default is no maintenance word, so if it is desirable to allow other users who can log on into the data base's group, access to the DButil commands then this is the command to set a maintenance word. Note, other utilities, such as ADAGER, also make use of the maintenance word in this fashion. If no maintenance word is provided, then the existing maintenance word if one exists is removed. BUFFSPECS - is backward compatibility only. LOGID - Is the MPE log identifier used when the GETLOG command was issued. Use of this parameter is part of the setting up of the logging procedure and is discussed in that full context later. PASSWORD - Allows the changing of the data base passwords. The user class number must be used in the read/write security specification at the item or set level for the new password to have any effect. Usually, data base passwords are hard coded into applications and therefore changing them in this fashion can have drastic results on the ability of the user or application to open the data base. This, however is a good way to test the error logic build into the application, to see what happens when the data base cannot be successfully opened. SUBSYSTEMS - A parameter that can be used to allow or disallow read and/or write access on an application by application basis. This parameter can be checked programmatically via the DBinfo intrinsic and is built into QUERY. So in short, this is a way to control QUERY access for each data base. 4.10.5 ERASE The erase command is executed to reinitialize the entire data base to its created state. This is the time saving equivalent of purging and rebuilding the entire data base. A prime design consideration for splitting one data base into multiple data bases is that a data base can be erased many times faster than the individual records can be deleted. However, you can only take advantage of this if the records to be deleted reside entirely in their own Image data base. The format of the erase command is: ERA[SE] data base name[/maintenance word] NOTE. DButil DOES NOT ASE ARE YOU SURE" SO BE AWARE THAT ALL DATA WILL BE WIPED OUT WHEN YOU ENTER THIS COMMAND- IT IS A VERY WISE PRACTICE TO CHECE YOUR LOG ON WITH A SHOWME BEFORE ERASING THE WRONG DATA BASE. 4.10.6 EXIT The exit command gets you back to MPE. Almost all utilities allow you to enter E to exit, however to be consistent with the three letter command notation standard, you must enter EXI to exit. 4.10.7 Dbschema walk through :DBschema PAGE 1 HEWLETT-PACKARD 30391C.07.04 TurboIMAGE/3000: DBSCHEMA FRI, NOV 6, 1998, 9:11 AM (C) HEWLETT-PACKARD CO. 1987 >BEGIN DATA BASE TEST; >PASSWORDS: >10 READER; >20 WRITER; > >ITEMS: >ADDRESS-LINE-1, X40 (10/); >ADDRESS-LINE-2, X40 (10/); >ADDRESS-LINE-3, X40 (10/); >CITY, X40 (10/); >CUSTOMER-NAME, X40 (10/); >ORDER-NO, X02 (10/); >STATE, X02 (10/); >TOTAL-DOLLARS, X10 (10/); >ZIP, X10 (10/); > >SETS: >NAME: CUSTOMER-MASTER, M (10/20); > >ENTRY: CUSTOMER-NAME(1), > ADDRESS-LINE-1, > ADDRESS-LINE-2, > ADDRESS-LINE-3, > CITY, > STATE, > ZIP; >CAPACITY: 5; > >NAME: ORDER-NO-MASTER, A (10/20); > >ENTRY: ORDER-NO(1); >CAPACITY: 5; > >NAME: ORDER-SUMMARY, D (10/20); > >ENTRY: ORDER-NO(ORDER-NO-MASTER), > CUSTOMER-NAME(CUSTOMER-MASTER), > TOTAL-DOLLARS; >CAPACITY: 300000,1000,1000; >END. DATA SET TYPE FLD PT ENTR MED MAXIMUM BLK BLK DISC NAME CNT CT LGTH REC CAPACITY FAC LGTH SPACE CUSTOMER-MASTER M 7 1 106 117 5 3 352 16 ORDER-NO-MASTER A 1 1 1 12 5 5 61 16 ORDER-SUMMARY D 3 2 26 34 300000 15 511 272 INITIAL CAPACITY = 1005 INCREMENT ENTRIES = 1005 TOTAL DISC SECTORS INCLUDING ROOT: 320 NUMBER OF ERROR MESSAGES: 0 ITEM NAME COUNT: 9 DATA SET COUNT: 3 ROOT LENGTH: 652 BUFFER LENGTH: 511 TRAILER LENGTH: 256 ROOT FILE TEST CREATED. :dbutil HP30391C.07.04 TurboIMAGE/XL: DBUTIL (C) COPYRIGHT HEWLETT-PACKARD COMPANY 1987 >>create test Database TEST has been CREATED. >>show test all For database TEST Maintenance word is not present. Access is enabled. Autodefer is disabled. Dumping is disabled. Rollback recovery is disabled. Recovery is disabled. ILR is disabled. Mustrecover is disabled. Logging is disabled. Prefetch is disabled. Indexing is disabled. HWMPUT is disabled. Restart is disabled. Database has never been stored. Database has been modified since last store date. Shadowing is disabled. Subsystem access is READ/WRITE. CIUPDATE is disallowed. Dynamic capacity expansion is used for 1 data sets. Database has no B-trees. Logid is not present. XM log set : this database is not attached to an XM log set. The language is 0 - NATIVE-3000. Buffer specifications: 8(1/2),9(3/4),10(5/6),11(7/8),12(9/10),13(11/12),14(13/14), 15(15/16),16(17/18),17(19/120) No other users are accessing the database. ------------------------------------------------------------------- 5.0 What are the Data Base Access Intrinsics? The data base is programmatically accessed via calls to intrinsics that contain parameters that are passed back and forth. The parameters along with the intrinsic called act as commands to Image to perform the specified task. It will be much easier if you think of the individual intrinsics as commands that come in pairs. Just like the seven dwarfs, happy-grumpy,sleepy-Sneezy, dopey-doc, and bashful. The Image intrinsic commands can also be paired, DBopen-DBclose, DBlock-DBunlock, DBerror-DBexplain, DBput-DBdelete, DBget-DBfind, and DBupdate. Another good discussion/overview of the intrinsics can be found at: http://www.robelle.com/smugbook/imageapi.html 5.1 Parameters and data types Before getting into the discussion of the actual commands, we can cover some of the general housekeeping issues. How is the intrinsic referenced in the different programming languages. What are some of the common parameters passed back and forth. And, most importantly, how can the program tell if its call was successful or not The syntax for the intrinsic procedure call is different for various languages. The general form is presented here. See the program examples included for more detail. COBOL CALL "DBxxxx" USING parm1, parm2, parm3 ... FORTRAN CALL DBxxxx (parm1, parm2, parm3 ... ) SPL DBxxxx (parm1, parm2, parm3 ... ) BASIC 100 CALL XDBxxxx (parm1, parm2, parm3 ... ) PASCAL DBxxxx (parm1, parm2, parm3 ... ) The xxxx indicates the various commands available. The parameters vary slightly with each command and are discussed in the next section. 5.1.1 Data base parameters The three most common parameters are the base name, the mode, and the status, which are required for most intrinsic calls. The base name is the name given in the BEGIN statement of the data base schema. The mode parameter has different meanings for different Image calls, and for some calls the value is always the same. This is done for consistency and compatibility with older versions of Image. +----------------------------------------------------------------------+ | INTRINSIC| Parameters passed to and received from Image | | COMMAND | UPPER case are input parameters, lower case are returned | +----------------------------------------------------------------------+ | DBopen | BASE | PASSWORD | MODES | status | | | | | ------ | NAME | | (1-8) | array | | | | |----------------------------------------------------------------------| | DBclose | base | DATA SET | MODES | status | | | | | ------- | name | NAME OR # | (1-3) | array | | | | |----------------------------------------------------------------------| | DBlock | base | QUALIFIER | MODES | status | | | | | ------ | name | | (1-6) | array | | | | |----------------------------------------------------------------------| | DBunlock | base | DUMMY | MODE | status | | | | | -------- | name | PARAMETER | = 1 | array | | | | |----------------------------------------------------------------------| | DBerror | | | | STATUS | | buffer | | | ------- | | | | ARRAY | | | | |----------------------------------------------------------------------| | DBexplain| | | | STATUS | | | | | ---------| | | | ARRAY | | | | |----------------------------------------------------------------------| | DBput | base | DATA SET | MODE | status | LIST | BUFFER | | | ----- | name | NAME OR # | = 1 | array | | | | |----------------------------------------------------------------------| | DBdelete | base | DATA SET | MODE | status | | | | | -------- | name | NAME OR # | = 1 | array | | | | |----------------------------------------------------------------------| | DBget | base | DATA SET | MODE | status | LIST | buffer | ARG | | ----- | name | NAME OR # | (1-8) | array | | | VALUE | |----------------------------------------------------------------------| | DBfind | base | DATA SET | MODE | status | ITEM | | ARG | | ------ | name | NAME OR # | = 1 | array | | | VALUE | |----------------------------------------------------------------------| | DBupdate | base | DATA SET | MODE | status | LIST | BUFFER | | | -------- | name | NAME OR # | = 1 | array | | | | |----------------------------------------------------------------------| | DBbegin | base | LOGFILE | MODE | status | TEXT | | | | ------- | name | TEXT | = 1 | array | LEN | | | |----------------------------------------------------------------------| | DBend | base | LOGFILE | MODE | status | TEXT | | | | ----- | name | TEXT | = 1 | array | LEN | | | |----------------------------------------------------------------------| | DBinfo | base | QUALIFIER | MODE | status |BUFFER| | | | -------- | name | | | array | | | | |----------------------------------------------------------------------| | DBxbegin | base | QUALIFIER | MODE | status |BUFFER| | | | -------- | name | | | array | | | | |----------------------------------------------------------------------| | DBxend | base | QUALIFIER | MODE | status |BUFFER| | | | -------- | name | | | array | | | | +----------------------------------------------------------------------| 5.1.2 The status parameters The status parameter is the only parameter required for each and every call to Image. It is used to communicate the degree of successfulness and provide other useful data back to the calling program. The status parameter is a ten word array with the following format: Word 1, Zero if call was good or a number if something went wrong, negative numbers indicate bad parameters or conditions, positive error numbers indicate that the procedure encountered some exception condition. See appendix A of the Image manual for more details. The negative error numbers generally indicate some type of problem with the parameters being passed to the Image procedure. Problems with parameter number one range from -11 to -19, problems with parameter number two range from -21 to -29, ETC. For DBput, DBdelete, DBget, DBfind, and DBupdate: Word 2, The current entry length in words. Word 3/4, The current record number. Word 5/6, The chain length, returned by DBfind. Word 7/8, Backward pointer to the previous entry in the cabin. Word 9/10, Forward pointer to the next entry in the chain. For all the other intrinsics see individual descriptions. 5.2 DBopen & DBclose DBopen and DBclose are the first and last intrinsic commands passed to Image. They are used to begin and end the application's access to the data base. Just like any other file you must have at least read access to the data base you are attempting to open. 5.2.1 Description Upon calling DBopen Image: Opens the root file, but does not open any of the sets until referenced by the data set parameter in any of the other intrinsics. Uses the password provided to determine the security restrictions that will apply for subsequent calls under this DBopen. Uses the mode provided to determine allowable concurrent assessors. Upon calling DBclose Image: 5.2.2 Parameters The form of the DBopen intrinsic call is: DBopen (BASE,PASSWORD,MODE,status) BASE - Is a character string that contains the name of the data blank, then the data base name followed by a semicolon or a blank. If the data base is successfully opened, Image puts the data base ID word in the space you allocated in the first two bytes of the BASE array. This base ID identifies this DBopen call to Image for subsequent accesses. Therefore, if you overlay the base name variable in your program you will not be able to execute any further data base intrinsics under the current DBopen. To hard code the location of the data base or access a data base in another account, simply append the .GROUP and or .ACCOUNT to the string of characters you pass to image as the BASE parameter. PASSWORD - Is a string of characters corresponding to the passwords built into the schema followed by an optional user identifier. The password is used to establish the user's class number which is then mapped to the access capabilities at the set and item levels. The user identifier is used by the recovery program to assist in identifying users who log on under the same group and account. The construct is the left justified password followed by the optional user identifier followed by a space. If a semicolon is used the class number will be 63 if the user is logged on exactly as the user who used DBschema and DButil to create the data base. If the password provided matches one of the passwords in the password table, then the class number is mapped from the password table. If the password does not match any of the defined passwords, then a class number of zero is assigned and Image will check to see if this class number has been specified on any of the sets or items within those sets. MODE - Tells Image what type of access is requested, the valid modes are: 1 - Modify access with enforced locking, the other users may open only with modes 1 and 5. This is the most useful mode in that it allows all of the intrinsic calls. However, our password must allow read or write access to the specific sets or items requested. 2 - Update access same as mode 1 but disallows calls to DBput and DBdelete. The other users must open in either mode 2 or mode 6. I have never seen this mode used, probably included for completeness. 3 - Modify exclusive, same as mode 1 but since no one else is allowed to access the data base, no locking is required and no other concurrent modes of open are allowed. This mode has performance considerations for batch jobs that add or delete large numbers of entries. Mode 3 turns off the code required to share the data base and allows more efficient use of the buffers. 4 - Modify, allow concurrent read access. This is the same as one but no locking is required because the only other assessors allowed will be mode 6. 5 - Read, allow concurrent modify access. This is a read only mode that will allow any number of other mode 1 and ' assessors. This mode is the second most popular mode, it is used by report programs that only need to read and process the entries. 6 - Read, allow concurrent modify access. This is a read only mode that allows concurrent modes of 6, 2, 4, or 8. 7 - Read, exclusive. This mode would only be used to improve the performance of a large batch job that required the data to remain the same for the length of the procedure. 8 - Read, allow concurrent read access in mode 6. STATUS - Is a ten word array that returns information about the results of the intrinsic call, if the DBopen worked the status array would contain: Word Contents 1 The condition word of 0. 2 The user class number assigned from 0 to 64. 3 The current size of the global buffer area in words. 4 The current size of the user buffer area in words. 5/10 Not documented unless an error occurs Typical non-zero condition words: -11 Bad base parameter -21 Bad password parameter -31 Bad mode parameter -92 Data base not created The form of the DBclose intrinsic call is: CALL DBclose(BASE,DATA SET,MODE,status) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set to be closed. This parameter's usage is dependent on the mode provided. MODE - Directs Image to close a set or the entire data base. A mode of 1 terminates access to the entire data base. A mode of 2 closes the data set referenced in the DATA SET parameter. And a mode of 3 reinitializes the data set referenced in the DATA SET parameter but the set will remain opened. Modes 2 and 3 are used to reposition the current record pointer so that a DBget of the next record will return the first record in the data set. STATUS - The condition word will contain a zero if the close was performed as requested. Common non-zero status words are similar to the DBopen. 5.2.3 Walk through NOTE. THAT DBDRIVER IS DOCUMENTED IN SECTION 6.5. In these walk through pay special attention to the CPU and elapsed times, these show the relative performance profiles for each type of Image data base call. :DBDRIVER DBDRIVER C.05.00 FRI, NOV 6, 1998, 9:23 AM (C) HEWLETT-PACKARD 1978 Command: !B=TEST Command: !Q=; Command: !M=1 Command: O Elapsed time (MS) = 287 CPU = 117 Baseid=%000001 Class=64 Current DBG size=32767 DBU size=10240 Command: ?S 0000 0040 7fff 2800 0000 0191 4163 88d4 0001 0000 Command: C Elapsed time (MS) = 21 CPU = 21 Command: ?S 0000 0040 7fff 2800 0000 1193 4163 88d4 0001 0000 5.3 DBlock & DBunlock Image leaves the locking of individual data entries or groups for entries totally up to the application designer and programmer. Locking allows shared access to the data base and strategies of locking are highly application dependent. One general rule does apply however, the larger the number of shared assessors, the smaller the group of entries locked. We have seen that the mode of DBopen can serve as a form of data base lock, however these exclusive access modes would not be appropriate for an application with more than one concurrent user. So, the DBlock and DBunlock commands allow programmatic locking of the entire data base, a specific data set, or a specific set of entries within a data set with a particular data item value. IT IS EXTREMELY IMPORTANT TO UNDERSTAND THAT THESE "LOCKS" ARE LOGICALLY HELD AND THAT THERE IS ABSOLUTELY NO CHECKING DONE TO SEE IF THE LOCKED ENTRIES ARE INDEED THE ONES BEING UPDATED. IT IS ALSO EXTREMELY IMPORTANT TO UNDERSTAND THAT ALL ACCESSORS MUST LOCK AT A GIVEN LEVEL IN ORDER TO RECEIVE THE BENEFITS OF GREATER CONCURRENT ACCESS. In other words, one application locking at the data base level will negate the effects of all of the other applications locking at the data set or data item level. 5.3.1 Description Upon calling DBlock Image: Will check the current lock list to see if a conflict exists. If no conflict exists, Image will add the lock request to the lock list. If a conflict exists, then one of two things will occur depending on the mode, which tells Image to wait or return. If the mode is set for unconditional locking, the program suspends and waits until all of the conflicting entries in the lock list have been unlocked. If a conditional lock mode was requested and a conflict exists, the status is returned to the calling program which can then choose to repeat the call or go on to other transactions. NOTE THAT THE ONLY OTHER WAY TO CLEAR A PROGRAM SUSPENDED UNCONDITIONALLY ON A DBlock IS TO RESTART THE SYSTEM, BUT AN UNCONDITIONAL LOCK IS THE ONLY WAY TO GET IN LINE FOR A LOCK REQUEST. Locks should never be held in a situation where the user may have to enter some data, instead, your application should lock around the logical transaction. More on this topic as we cover additional intrinsics. Upon calling DBunlock, Image will: Verify that the mode is set to 1, remove the calling programs lock from the lock list, and reactivate any programs which had been suspended because of this entry in the lock list. 5.3.2 Parameters The form of the DBlock intrinsic call is: Call DBlock(BASE,QUALIFIER,MODE,status) QUALIFIER - Is ignored in modes 1 and 2, is the data set name for modes 3 and 4, and is an array of lock descriptions for modes ' and 6. The format of the lock descriptor is a table within a table: General format: Word Contents: 1 The number of lock descriptors 2 The first lock descriptor n The second lock descriptor m The third lock descriptor Lock descriptor format: Word Contents 1 The length of the descriptor 2 An array containing the data set, data item and a relational operator and value of the item or items to be locked. MODE - Tells Image what level and whether to lock conditionally or unconditionally. The odd modes are unconditional and the even modes are conditional. Modes 1 and 2 lock at the data base level. Modes 3 and 4 lock at the data set level and modes 5 and 6 lock on the content of a data item or items. STATUS - The condition word for DBlock is always zero. Word two of the status area must be checked to see if the lock was actually granted as requested. For modes 1 through ' this value will be one if successful and zero if not. For modes ' and this value will depend on the number of lock descriptors coded in the request. Typical non-zero condition words: 20 Data base lock conflict (modes 2,4,6) 22 Data set lock conflict (modes 2) 23 Data entry lock conflict (mode 4) 24 Data item lock conflict (mode 6) 25 Entry or entries already locked (mode 6) The form of the DBunlock intrinsic call is: Call DBunlock(BASE,DATA SET,MODE,status) DATA SET - Is not used but is probably included for some planned functionality that was never added, or to be compatible with the number and types of parameters for the DBlock call. MODE - Must be set to 1. STATUS - The condition word for DBunlock is always zero Word two of the status area must be checked to see if the unlock was actually granted as requested. For modes 1 through 4 this value will be one if successful and zero if not. For modes 5 and 6 this value will depend on the number of lock descriptors coded in the original lock request. Data set locks released are counted as one descriptor. Typical non-zero condition words: -11 Bad BASE parameter -31 Bad MODE parameter 5.3.3 Walk through !M=1 Command: LOCK Elapsed time (MS) = 0 CPU = 0 Command: ?S 0000 0001 0000 2800 0000 1199 4163 88d4 0001 0000 Command: RELEASE Elapsed time (MS) = 0 CPU = 0 Command: ?S 0000 0001 0000 2800 0000 119a 4163 88d4 0001 0000 Mixed Mode Unconditional Locking Example Time| DBopen I | DBopen II | DBopen III | DBopen IV ----|---------------|---------------|---------------|-------------- 1 |DBlock MODE= 5 | | | |ORDER-SUMMARY | | | |ORDER-NUMBER= 1| | | |<> | | | ----|---------------|---------------|---------------|-------------- 2 | |DBlock MODE=3 | | | |ORDER-SUMMARY | | | |<> | | ----|---------------|---------------|---------------|-------------- 3 | | |DBlock MODE= 5 | | | |ORDER-SUMMARY | | | |ORDER-NUMBER= 7| | |<> |<> | ----|---------------|---------------|---------------|-------------- 4 |DBunlock | | | | |<> |<> | ----|---------------|---------------|---------------|-------------- 5 |DBlock MODE= 5 | | | |ORDER-SUMMARY | | | |ORDER-NUMBER= 2| | | |<> | |<> | ----|---------------|---------------|---------------|-------------- 6 | | | |DBlock MODE= 1 |<> | |<> |<> ----|---------------|---------------|---------------|-------------- 7 | |DBunlock | | |<> | |<> |<> ----|---------------|---------------|---------------|-------------- 8 |DBunlock | | | | | | |<> ----|---------------|---------------|---------------|-------------- 9 | | |DBunlock | | | | |<> 5.4 DBerror & DBexplain These intrinsics are used to format the status array into a buffer or the STDLIST of the application program. These error messages are meant for debugging purposes and should not be presented to the end user in any form. The better application programs will print an error listing to the system line printer so that as much information about the error condition can be accurately captured as possible. 5.4.1 Description Upon calling DBerror Image will: Use the status array to determine the error from a prior call and look up that error in a message catalog. Then load the text into the buffer provided and set the length parameter to the length of the text. This buffer then can be used to print a complete error incident report which should include the application name, the users current sign on, logical device and time of day. Upon calling DBexplain Image will: Use the status array to format an almost incomprehensible error message on the applications STDLIST device. This call should be used for debugging purposes only because a user would never be able to correctly copy down or relay the formatted message to anyone who could act on it. 5.4.2 Parameters The form of the DBerror Intrinsic is: DBerror(STATUS,buffer,length) STATUS - Is the status array from a previous Image call. BUFFER - Is the text error message returned by DBerror. LENGTH - Is the number of characters in the buffer to a maximum of 72. There is no status returned from the DBerror call so we can really never know if there were any errors in the DBerror intrinsic call. The form of the DBexplain call is: DBexplain(STATUS) STATUS - Is the status array from a previous Image call. There is no status returned from the DBexplain call, so again we really never know if this intrinsic worked. 5.4.3 Walk through Command: !b=BADDB Command: !Q=; Command: !M=1 Command: OPEN Elapsed time (MS) = 58 CPU = 43 NO SUCH DATABASE Command: ?S ffff 0000 0034 0d50 0000 0191 4163 88d4 0001 0000 Command: XPLAIN TURBOIMAGE ERROR AT ???????; RETURN STATUS = -1 DBOPEN, MODE 1, ON BADDB NO SUCH DATABASE HEX DUMP OF STATUS ARRAY FOLLOWS: ffff 0000 0034 0d50 0000 0191 4163 88d4 0001 0000 5.5 DBput & DBdelete Well, now that we can open and close the data base and explain the possible errors, we can get down to the business of getting data into and deleting data out of a data base. The DBput and DBdelete intrinsic calls can only be made to data bases opened in modes 1,3 or 4, otherwise known as modify access. The user class granted from the password must be on the write side of the read/write class list for the data set whose entries are being added or deleted. Data entries are individually added by calls to the DBput intrinsic. Entries are manually added to manual masters and details by calling DBput. Image automatically adds entries to automatic masters on an as required basis to serve as index records as detail entries are added to associated masters. For details indexed by manual masters, your application must add the manual entry prior to attempting to add the detail entry. This feature can be used to make sure that, for example, the vendor number exists prior to the adding of P.O.'s to a purchase order detail. When adding to a manual master, the search item must be unique, if an entry is already present with the same search item, an error is returned. To delete an entry, you must first read the entry with the DBget intrinsic. If the entry is a detail, and it is the only entry on a chain from an automatic master, then the master entry will also be deleted. If the entry to be deleted is in a manual master, then all associated detail chains must be empty for the delete to be successful. If the data base was opened in mode 1, then the DBdelete sequence must be accompanied by a preceding call to DBlock and followed by a call to DBunlock, to insure that another user does not delete the entry between when the record was read and when it was deleted. 5.5.1 Description Upon calling DBput, Image will: Check to see if the data set is an automatic master, a manual master or a detail data set. If it is an automatic master, Image will return an error message. If the DBput is to a manual master, Image will: Check to see if the search item is in the list specified, and see if the search item is unique by hashing the key. Check to see if a lock covers the data base, data set or some value of the search item if the open mode was 1. If the primary location is occupied, Image must determine if the entry at that location is a primary entry or a secondary entry on a secondary chain to another primary. If the entry at the hashed location was a primary, then the new entry is added to its secondary chain at the nearest open record. If the entry hashed to a location that was occupied by a secondary, then the secondary is moved to another location and the new entry is made a primary at its hashed location. If the DBput is to a detail, Image will: Check the free space chain of deleted entries. If any entries have been deleted, then the most recently deleted location is used. If no records have ever been deleted or all delete entries have been reused, then new entries are added at the EOF. Add the detail to all chains from associated masters by finding each associated master chain head and modifying the backward pointer and chain count of each master. Then finding the old last entry in the chain so that it now has a forward pointing to the new record in the chain. And finally, setting the forward and backward pointers of the new entry. If the detail data set specified that its chains were to be kept in sorted sequence, then instead of adding at the end of the detail chain, Image will read up the chain to find the logical point to add the record in the chain. This can produce applications which are spectacularly slow, and should only be specified if the average chain length is small or the records are added in the same sequence as the sort item. Upon a call to DBdelete, Image will: Check to see if there is a current data entry to be deleted, and if it is, a manual master or detail entry. If no current record is present Image will return an error. If the current entry to be deleted is a manual master entry, Image will: First, check to see if all of the chain heads are empty. If not, Image will return an error for the DBdelete call. The Image will identify the type of master entry to be deleted. The entry can be either a primary with no secondaries, a primary with secondaries or a secondary entry. To delete a primary entry with no secondaries, Image sets the entry to zeros, flips the bit in the bit map to 0, and updates the free space count in the label of the data set. To delete a primary entry that has secondary entries, Image finds the next entry in the secondary chain and migrates it to the primary location, overlaying the primary to be deleted, zeros out the secondary's old location, flips the bit in the bit map and updates the free space count on the label of the data set. To delete a secondary entry, Image hashes the key value of the secondary to find the primary entry, locates the preceding and following entries in the secondary chain, if any, and updates their forward and backward pointers, updates the secondary chain counter in the primary entry, flips the bit in the bit map and updates the free space count on the label of the data set. 5.5.2 Parameters The form of the DBput Intrinsic is: DBput(BASE,DATA SET,MODE,status,LIST,BUFFER) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set to which the entry is to be added. MODE - Must be set to 1. STATUS - The condition word will contain a zero if the entry was added as requested. When successful, the status will also hold other interesting information: Word Contents 1 The condition word 2 Word length of the buffer array 3/4 Two word relative record number of the new entry 5/6 Count of the number of entries in the chain, if this was 7/8 a master, it is the number of secondaries, if this was a detail, it is the number of entries in the detail chain. 9/10 If this was a detail this is a pointer to the next entry on the chain, If a master its zeros. LIST - This is the list of data items to be added in the entry. This list can be the data item names, numbers, or a special character. The data item numbers are simply the number given in the sequence defined in the item table of the schema. Two special characters may be used, the * in the list tells image to use the identical list as was specified in the previous call, an A tells Image to use all of the data items for the data set. Both of these special lists save processing time because if data item names are specified, Image must look them up in the item table and cut and paste the contents of the buffer to match the actual entry data record. All of the search or sort items must be in the list provided or else Image will not be able to complete the DBput request. BUFFER - Is an array containing the values of the data items to be added. The individual fields of the buffer must correspond to the sequence and type of the items in the LIST. Typical non-zero condition words: -12 No lock covering entry to be added. -14 Cannot do a DBput in current DBopen mode. -21 Bad data set in DATA SET parameter. -23 Do not have write access to data set. -24 Cannot write to an Automatic master. -52 Bad LIST or item in list. -53 Missing search or sort item. 16 Data set is full (delete entries or resize) 43 Attempt to add duplicate master entry. 1xx Missing manual master entry on path number xx The form of the DBdelete Intrinsic is: DBdelete(BASE,DATA SET,MODE,status) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set to which the entry is to be removed. MODE - Must be set to 1. STATUS - The condition word will contain a zero if the entry was removed as requested. When successful, the status will remain as it was after the preceding DBget: Word Contents 1 The condition word. 2 Zero. 3/4 Two word relative record number of the deleted entry. 5/6 Count of the number of entries in the chain, if this was 7/8 a master it is the new number of secondaries, if this was a detail it is not modified. 9/10 Not modified from the preceding call. Typical non-zero condition words: -12 No lock covering entry to be deleted. -14 Cannot do a DBdelete in current DBopen mode. -21 Bad data set in DATA SET parameter. -23 Do not have write access to data set. 17 No entry has been previously found. 44 Attempt to delete a master with detail chains. 5.5.3 Walk through Command: !Q=CUSTOMER-MASTER Command: !M=1 Command: !L=CUSTOMER-NAME Command: !E=TEST CUSTOMER1 Command: PUT Elapsed time (MS) = 41 CPU = 19 Command: ?S 0000 0014 0000 0005 0000 0001 0000 0000 0000 0000 Command: !E=TEST-CUSTOMER2 Command: PUT Elapsed time (MS) = 3 CPU = 2 Command: ?S 0000 0014 0000 0004 0000 0002 0000 0000 0000 0000 Command: DELETE Elapsed time (MS) = 2 CPU = 2 Command: ?S 0000 0000 0000 0004 0000 0002 0000 0000 0000 0000 5.5.4 Image master secondary chain processing NOTE: THIS SECTION NEEDS UPDATING FOR MASTER DYNAMIC EXPANSION! Secondary master entries happen when unlike search item values hash to the same relative record address. The frequency of this event is dependent upon the capacity of the master and the content of the search item. When this happens Image finds the nearest open entry using the bit map, and places this entry at that relative record number and identifies it as a secondary by chaining it to the entry that got put in the primary location first. A secondary chain consists of the first entry to hash to a given relative record address and all of its secondaries linked together with forward and backward pointers. It is important to understand the performance and practical implications of secondary chains because they can cause severe performance degradation and erroneous processing. The secondary chain head is made up of three fields, a count of the number of entries in the chain, a forward pointer to the next entry on the chain, and a backward pointer to the previous entry on the chain. If the count is 1 then the entry is a primary with no secondaries. If the count is 0 then the entry is a secondary to some other primary entry. And if the count is greater than 1 then the entry is a primary and the count is the number of secondary entries on the chain. An example of the ORDER-NO-MASTER after four entries have been added: ORDER-NO-MASTER Relative| Secondary Chain Head | Detail Chain Head | Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO --------|-----|--------|-------|-----|--------|-------|---------- 180 | | | | | | | --------|-----|--------|-------|-----|--------|-------|---------- 181 | 2 | 187 | 187 | 1 | 133 | 133 | 1 --------|-----|--------|-------|-----|--------|-------|---------- 182 | 1 | 0 | 0 | 1 | 2 | 2 | 9 --------|-----|--------|-------|-----|--------|-------|---------- 183 | | | | | | | --------|-----|--------|-------|-----|--------|-------|---------- 184 | | | | | | | --------|-----|--------|-------|-----|--------|-------|---------- 185 | 1 | 0 | 0 | 1 | 511 | 511 | 5 --------|-----|--------|-------|-----|--------|-------|---------- 186 | | | | | | | --------|-----|--------|-------|-----|--------|-------|---------- 187 | 0 | 0 | 0 | 1 | 358 | 358 | 7 --------|-----|--------|-------|-----|--------|-------|---------- 5.5.5 Migrating secondaries and DBput Suppose that an order number of 10 is to be added, and it hashes to relative record number 182. Relative record number 182 has a primary entry so Image will attempt to locate the new secondary entry in the same block as its primary and link them together via forward and backward pointers. ORDER-NO-MASTER Relative| Secondary Chain Head | Detail Chain Head | Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO --------|-----|--------|-------|-----|--------|-------|---------- 180 | 0 | 0 | 0 | 1 | 5 | 5 | 10 --------|-----|--------|-------|-----|--------|-------|---------- 181 | 2 | 187 | 187 | 1 | 133 | 133 | 1 --------|-----|--------|-------|-----|--------|-------|---------- 182 | 2 | 180 | 180 | 1 | 2 | 2 | 9 --------|-----|--------|-------|-----|--------|-------|---------- 183 | | | | | | | Now suppose that order number 22 is to be added, and it also hashes to relative record number 182. This new entry is also linked to the secondary chain in the following fashion. ORDER-NO-MASTER Relative| Secondary Chain Head | Detail Chain Head | Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO --------|-----|--------|-------|-----|--------|-------|---------- 180 | 0 | 0 | 183 | 1 | 5 | 5 | 10 --------|-----|--------|-------|-----|--------|-------|---------- 181 | 2 | 187 | 187 | 1 | 133 | 133 | 1 --------|-----|--------|-------|-----|--------|-------|---------- 182 | 3 | 183 | 180 | 1 | 2 | 2 | 9 --------|-----|--------|-------|-----|--------|-------|---------- 183 | 0 | 180 | 0 | 1 | 101 | 101 | 22 Migrating secondaries happen when a new entry's search item value hashes to a relative record occupied by a secondary entry, or when a primary entry with secondary entries is deleted. In the first example order number 2 is added and hashes to relative record number 183, which is already occupied by a secondary entry. In this case the secondary entry is moved to relative record number 18' so that the new entry can occupy its rightful primary location. ORDER-NO-MASTER Relative| Secondary Chain Head | Detail Chain Head | Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO --------|-----|--------|-------|-----|--------|-------|---------- 180 | 0 | 0 | 184 | 1 | 5 | 5 | 10 --------|-----|--------|-------|-----|--------|-------|---------- 181 | 2 | 187 | 187 | 1 | 133 | 133 | 1 --------|-----|--------|-------|-----|--------|-------|---------- 182 | 3 | 184 | 180 | 1 | 2 | 2 | 9 --------|-----|--------|-------|-----|--------|-------|---------- 183 | 1 | 0 | 0 | 1 | 1 | 1 | 2 --------|-----|--------|-------|-----|--------|-------|---------- 184 | 0 | 180 | 0 | 1 | 101 | 101 | 22 --------|-----|--------|-------|-----|--------|-------|---------- In this second example order number 1 is to be deleted, and this will cause the next entry in the secondary chain to move into the primary location. ORDER-NO-MASTER Relative| Secondary Chain Head | Detail Chain Head | Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO --------|-----|--------|-------|-----|--------|-------|---------- 180 | | | | | | | --------|-----|--------|-------|-----|--------|-------|---------- 181 | 2 | 187 | 187 | 1 | 133 | 133 | 1 --------|-----|--------|-------|-----|--------|-------|---------- 182 | 2 | 184 | 184 | 1 | 5 | 5 | 10 --------|-----|--------|-------|-----|--------|-------|---------- 183 | 1 | 0 | 0 | 1 | 1 | 1 | 2 --------|-----|--------|-------|-----|--------|-------|---------- 184 | 0 | 0 | 0 | 1 | 101 | 101 | 22 --------|-----|--------|-------|-----|--------|-------|---------- 5.6 DBget & DBfind This is where all of the discussion about access methods pays off. How information is retrieved from the data base is critical to the success of the application. Hopefully, you have a standard application that reads the data 70% of the time and writes it 30% of the time. In other words data is only writing out to disc if you are going to read it back more than once. The DBfind and DBget Intrinsics work together to allow rapid retrieval of data entries. As we'll see, Image uses directed, calculated, serial and chained access via these two intrinsics for all entry retrieval. DBfind is actually short for DBfindchainhead, remembering the long name will remind you that the DBfind intrinsic does not actually find any data. This intrinsic is solely used to set up the current path for subsequent chained access into a specific detail data set. 5.6.1 Description Upon calling DBget, Image will: Check to see if the data set is opened and if not open it. Then verify that the user class number has read access to all of the items in the list. The next step depends on the mode of the DBget: MODE = 1, Reread the current record: (masters & details) First, Image will determine if there is a current record from a previous DBget, if not, then an error will be returned. Then Image will check the bit map to the current record's block to see if a record at that location still exists. If so, then the record is read and the requested items are returned in the buffer. MODE = 2, Read the next valid record: (masters & details) Image will scan the bit maps in the current and subsequent blocks until an entry is found. Then the record is read and the requested items are returned in the buffer. MODE = 3, Read relative record number N: (masters & details) Image will read in the block containing the requested record, check the bit map to see if a record at that location exists and if so, read the record and return the requested items in the buffer. MODE = 5, Read the next record in the chain: (details only) Image will get the block of the next record on the chain which is the current path, check the bit map to see if a record exists at the forward pointer location and if so, read the record and return the requested items in the buffer. MODE = 6, Read the previous record in the chain: (details only) Image will get the block of the prior record on the chain which is the current path, check the bit map to see if a record exists at the backward pointer location and if so read the record and return the requested items in the buffer. MODE = 7, Read by calculating a relative record number (masters) Image will hash the argument provided generating a relative record number. The block containing that record is read in and the bit map is checked to see if a record exists at that location. If no record exists, then Image will return an error. If a record exists, then Image must read the record and compare the argument to the search item value of that record. If they match, then the requested items are loaded into the buffer. If they don't match, then Image must read each record on the secondary chain, matching on the argument and search item values. If no secondary entry matches, then Image returns an error. If a secondary does match, then Image loads the items requested into the buffer. MODE = 8, Read the primary entry (masters) This is identical to mode 7 except that the primary record will be returned even if a secondary argument is provided. This call was probably provided because this logic is required for a DBdelete of a secondary master entry. This saves chasing the secondary chain back to the primary entry to update the secondary chain count. Upon calling DBfind, Image will: First determine the appropriate master by matching the data item name specified to a unique master. The argument provided is hashed in a form of DBget mode=7 to locate the master entry which contains the chain head required. If no master entry exists, then an error is returned. If a master entry does exist, then the number of entries on the chain, and the forward and backward pointers are returned. 5.6.2 Parameters The form of the DBget Intrinsic is: DBget(BASE,DATA SET,MODE,status,LIST,buffer) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set from which the entry is to be retrieved. MODE - Determines the access method to be used: 1 - Reread the current record, very useful for making sure that someone has not modified or deleted it before we attempt to do the same to someone else. 2 - Serially read the next entry in the data set. For masters this could cause a great number of IO's as the next valid entry can only be found by looking in the bit maps of the next blocks until one is found that has an entry. 3 - Reverse read the previous entry in the data set. For masters this could have the same adverse effects as the forward serial read. 4 - Use the argument parameter to read the entry at a specific relative record location. This is useful if you have stored the relative record number of the entries in another file. Query uses this trick when sorting a large file. Instead of building a sort file as large as the entire detail, it builds a sort record consisting of the sort items and the relative record number of the entry, then after the sort, goes back to get the rest of the data using directed DBgets. 5 - Chained read of the next entry in the detail chain. Must be preceded by a DBfind at some point to establish the current path. 6 - Reverse chain read of the previous entry in the detail chain. Must be preceded by a DBfind call to establish the current path. 7 - Calculated read to locate a master entry that matches the value in the argument parameter. 8 - Read of the entry that occupies the primary address of the value in the argument parameter. This was provided because the DBdelete intrinsic needed to be able to quickly get to the primary entry when deleting a master entry that happened to be a secondary. STATUS - The condition word will contain a zero if the entry was retrieved as requested. When successful, the status will also hold other interesting information: Word Contents 1 The condition word 2 Word length of the buffer array 3/4 Two word relative record number of the entry retrieved. 5/6 If this was a master, it is the number of secondaries, If this was a detail, it is set to zero. 7/8 If this was a detail, this is a pointer to the prior entry on the current chain, if a master, it's zeros. 9/10 If this was a detail, this is a pointer to the next entry on the current chain, if a master, it's zeros. LIST - This is the list of data items to be retrieved into the buffer. This list can be the data item names, numbers, or a special character. The data item numbers are simply the numbers given in the sequence defined in the item table of the schema. Two special characters may be used, the * in the list tells image to use the identical list as was specified in the previous call, an @ tells Image to use all of the data items for the data set. Both of these special lists save processing time because if data item names are specified, Image must look them up in the item table and cut and paste the contents of the buffer to match the actual entry data record. All of the search or sort items must be in the list provided or else Image won't be able to complete the DBput request. BUFFER - Is an array to receive the values of the data items. The individual fields of the buffer will correspond to the sequence and type of the items in the LIST. Typical non-zero condition words: -11 Bad BASE parameter. -21 Bad DATA SET parameter. -31 Bad MODE parameter. -52 Bad LIST or item in list. 10 Beginning of data set from a call in mode 3. 11 End of data set from a call in mode 2. 12 Tried to read before the first record from a call in mode 4. 13 Tried to read past the end of file from a call in mode 4 14 Reached the top of the chain from a call in mode 6. 15 Reached the end of the chain from a call in mode 5. 17 No entry at/for the argument value specified. 1, Broken chain, Image detected that the entry reached via a call in mode 5 or 6 does not belong on the current chain. This can happen quite easily if your application doesn't issue a lock covering the current chain, and another user deletes the entry your application "thinks" is the next entry. It can also happen for real and can be very difficult to fix. This is why we will discuss data logging and recovery in upcoming chapters. The form of the DBfind Intrinsic is: DBfind(BASE,DATA SET,MODE,status,ITEM,ARGUMENT) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set from which the entry is to be retrieved. MODE - Must be 1 for a DBfind call. STATUS - The condition word will contain a zero if the chain head was retrieved as requested. When successful, the status will also hold other interesting information: Word Contents 1 The condition word 2/3 Zeros 5/6 The number of entries in the chain. This can be very useful in that the application can very quickly determine how many records will have to be processed to satisfy a given request. 7/8 Pointer to the relative record number in the detail data set of the last entry in the chain. 9/10 Pointer to the relative record number in the detail data set of the first entry in the chain. ITEM - This is the search item whose chain head we are looking for. The combination of this ITEM and the detail set name in DATA SET are all Image needs to determine the appropriate master to access and retrieve the chain head. ARGUMENT - Is the value of the data item named in the ITEM parameter. Typical non-zero condition words: -11 Bad BASE parameter. -21 Bad DATA SET parameter. -31 Bad MODE parameter. -52 Bad LIST or item in list. 17 No entry at/for the argument value specified. 5.6.3 Walk through 5.7 DBupdate & DBcontrol It used to be that DBupdate could only change the value of non-search and non-sort items of the current record. If a search or sort item must had to be changed then a call to DBdelete and DBput was required. This was one of the major design flaws of Image. To change a search item of one path into a detail, you were forced to delete the entire entry and re-add the entry to all paths. The difference between systems resources for the two types of operations is substantial. Now database access modes 1, 3, or 4, allow DBupdate to modify the values of detail data set search and sort items if permitted by the critical item update (CIUPDATE) option setting in combination with a call to DBcontrol. Master data set key item values cannot be modified with DBupdate. 5.7.1 Description Upon calling DBupdate, Image will: Check to see if the user class has write access to the data items in the list provided. Then Image will check to see if there is a current record. If not, Image will return an error. Image will then compare the value each item in the buffer to the value of each item in the record. If an item is different, then Image checks to make sure that if it is a search or sort item that CIUPDATE is allowed. 5.7.2 Parameters The form of the DBupdate Intrinsic is: DBupdate(BASE,DATA SET,MODE,status,LIST,BUFFER) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set in which the entry is to be updated. MODE - Must be set to 1. STATUS - The condition word will contain a zero if the entry was updated as requested. Word Contents 1 The condition word 2 Word length of the buffer array 3/4 Same two word relative record number of the entry retrieved with the preceding DBget. 5/6 If the preceding DBget was a master, it is the number of secondaries. If this was a detail, it is set to zero. 7/8 If the preceding DBget was a detail, this is a pointer to the prior entry on the current chain, if a master, it is zeros. 9/10 If the preceding DBget was a detail, this is a pointer to the next entry on the current chain. LIST - This is the list of data items to be updated in the entry, this list can be the data item names, numbers or a special character. The data item numbers are simply the numbers given in the sequence defined in the item table of the schema. Two special characters may be used, the * in the list tells image to use the identical list as was specified in the previous call, an @ tells Image to use all of the data items for the data set. Both of these special lists save processing time, because if data item names are specified, Image must look them up in the item table and cut and paste the contents of the buffer to match the actual entry data record. BUFFER - Is an array containing the values of the data items to be added. The individual fields of the buffer must correspond to the sequence and type of the items in the LIST. Typical non-zero condition words: -12 No lock covering entry to be added in DBopen mode 1. -14 Cannot do a DBupdate in current DBopen mode. -21 Bad data set in DATA SET parameter. -52 Bad LIST or item in list. 17 No current entry, try a DBget first. 41 Attempt to update a search or sort item. 42 Attempt to update a read only item, check your user class number against the data set and item read/write lists. 5.7.3 Walk through DBDRIVER C.05.00 TUE, NOV 10, 1998, 3:27 PM (C) HEWLETT-PACKARD 1978 Command: !B=TEST Command: !M=3 Command: !Q=; Command: O Elapsed time (MS) = 285 CPU = 120 Baseid=%000001 Class=64 Current DBG size=32767 DBU size=10240 Command: !Q=ORDER-SUMMARY Command: G Elapsed time (MS) = 17 CPU = 16 030061 052105 051524 026503 052523 052117 046505 051040 0 1 T E S T - C U S T O M E R 020040 020040 020040 020040 020040 020040 020040 020040 020040 020040 020040 020040 020040 031062 030060 020040 2 2 0 0 020040 020040 Command: !E=02 Command: !M=1 Command: !L=ORDER-NO Command: UPDATE Elapsed time (MS) = 1 CPU = 1 DBUPDATE attempted to modify value of critical item--key, search or sort Command: !M=5 Command: KONTROL Elapsed time (MS) = 1 CPU = 0 Command: ?S 0000 0000 0000 0001 0000 319b 4163 88d4 0005 0000 Command: !M=1 Command: UPDATE Elapsed time (MS) = 14 CPU = 8 Command: EXIT 5.8 DBbegin & Dbend These intrinsics are used to bracket a logical transaction in the Image transaction log file. A logical transaction may consist of many Intrinsic calls to perform a particular task. Transaction logging recovery uses the DBbegin and DBend to determine if the entire logical transaction was completed. If DBRECOVER does not find a corresponding DBend to a DBbegin, it will not apply the partially completed logical transaction. Both intrinsics allow posting of a text message to the log file. This can be useful if you intend to build custom programs to process the log file to print reports. 5.8.1 Description Upon calling DBbegin, Image will: First check to see if DBbegin has been called twice without a call to DBend, you can only have one logical transaction going at a time. Then write a transaction begin record out to the log file. This record may contain text passed via the DBbegin call. 5.10.2 Parameters The form of the DBbegin Intrinsic is: DBbegin(BASE,LOGTEXT,MODE,status,TEXTLEN) BASE - Is the unmodified parameter used in the DBopen call. LOGTEXT - Is an array containing up to 512 characters or 25 words of binary data. This data is only used if you have a custom program for processing the Image transaction log file. MODE - Must be set to 1. STATUS - The condition word will contain a zero if the record was written as requested. Word Contents 1 The condition word 2 Whatever it was prior to this call TEXTLEN - The number of words in the LOGTEXT. Typical non-zero condition words: -151 TEXTLEN too large -152 Called DBbegin twice in a row wi