Dbm (discuter | contributions) (→SQLfast) |
Dbm (discuter | contributions) (→SQLfast) |
||
Ligne 202 : | Ligne 202 : | ||
<br> | <br> | ||
− | :*<font color="black"><b>Case 1. Four hours to save the library</b>, draft version, <i> | + | :*<font color="black"><b>Case 1. Four hours to save the library</b>, draft version, <i>September 10, 2017.</i></font>[https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case01-Small-library.pdf [full text]] |
::''<b>Objective</b>'': This case study describes the emergency writing of a small application that was to implement the core functions of the management of a small library. The challenge was to replace the current software, lost in a recent crash of the server. All that was left after the accident was the last backup of the database, unfortunately in an unknown format. | ::''<b>Objective</b>'': This case study describes the emergency writing of a small application that was to implement the core functions of the management of a small library. The challenge was to replace the current software, lost in a recent crash of the server. All that was left after the accident was the last backup of the database, unfortunately in an unknown format. | ||
::''<b>Keywords</b>'': rapid application development, application prototyping, application architecture, GUI | ::''<b>Keywords</b>'': rapid application development, application prototyping, application architecture, GUI | ||
− | :*<b>Case 4. Schema-less databases - Part 1</b>, draft version, | + | :*<b>Case 4. Schema-less databases - Part 1</b>, draft version, <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case04-Schemaless-DB(1).pdf [full text]] |
::''<b>Objective</b>'': This document is the first of a series of three case studies that explore alternative data models that organize the data in structures that provide more flexibility than standard relational tables. We observe that this flexibility makes it easier to dynamically modify the schema of the database but, as an unfortunate consequence, the schema is less expressive. In some of these data models, the schema is practically devoid of any information, hence the name schema-less. | ::''<b>Objective</b>'': This document is the first of a series of three case studies that explore alternative data models that organize the data in structures that provide more flexibility than standard relational tables. We observe that this flexibility makes it easier to dynamically modify the schema of the database but, as an unfortunate consequence, the schema is less expressive. In some of these data models, the schema is practically devoid of any information, hence the name schema-less. | ||
::In this study, we explore two extreme data models. In the Universal table model, according to which the data of all the source tables are stored in a single table comprising the union of the columns of the source tables. In the second model, called Column-oriented model, each column of the source tables is implemented in an independent table. | ::In this study, we explore two extreme data models. In the Universal table model, according to which the data of all the source tables are stored in a single table comprising the union of the columns of the source tables. In the second model, called Column-oriented model, each column of the source tables is implemented in an independent table. | ||
::''<b>Keywords</b>'': non-relational data model, NoSQL, schema-less database, universal table model, universal relation, column-oriented data model, Cassandra, data migration, schema conversion | ::''<b>Keywords</b>'': non-relational data model, NoSQL, schema-less database, universal table model, universal relation, column-oriented data model, Cassandra, data migration, schema conversion | ||
− | :*<b>Case 5. Schema-less databases - Part 2</b>, draft version, | + | :*<b>Case 5. Schema-less databases - Part 2</b>, draft version, <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case05-Schemaless-DB(2).pdf [full text]] |
::''<b>Objective</b>'': This document studies a third family of alternative data models, namely the Key-Value data structure. In these models, the information is represented by triples that each defines the value of an attribute of an entity. Several approaches are described, with increasing levels of genericity. | ::''<b>Objective</b>'': This document studies a third family of alternative data models, namely the Key-Value data structure. In these models, the information is represented by triples that each defines the value of an attribute of an entity. Several approaches are described, with increasing levels of genericity. | ||
::''<b>Keywords</b>'': non-relational data model, NoSQL, schema-less database, key-value model, triple, triplestore, RDF, SPARQL, description logic, A-BOX, OWL, Redis, Berkley DB, data migration, schema conversion | ::''<b>Keywords</b>'': non-relational data model, NoSQL, schema-less database, key-value model, triple, triplestore, RDF, SPARQL, description logic, A-BOX, OWL, Redis, Berkley DB, data migration, schema conversion | ||
− | :*<b>Case 6. Schema-less databases - Part 3</b>, draft version, | + | :*<b>Case 6. Schema-less databases - Part 3</b>, draft version, <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case06-Schemaless-DB(3).pdf [full text]] |
::''<b>Objective</b>'': This document studies a fourth family of alternative data models, namely the object (or document) models. In these models, the information is represented by complex objects in which properties can be multivalued and composite. | ::''<b>Objective</b>'': This document studies a fourth family of alternative data models, namely the object (or document) models. In these models, the information is represented by complex objects in which properties can be multivalued and composite. | ||
::''<b>Keywords</b>'': non-relational data model, key-value model, object model, NoSQL, schema-less database, multivalued property, composite property, document-oriented DBMS, MongoDB, CouchDB, Azure, Datastore Oracle, metadata, index, data migration, schema conversion | ::''<b>Keywords</b>'': non-relational data model, key-value model, object model, NoSQL, schema-less database, multivalued property, composite property, document-oriented DBMS, MongoDB, CouchDB, Azure, Datastore Oracle, metadata, index, data migration, schema conversion | ||
− | :*<b>Case 9. Temporal databases - Part 1</b>, draft version, | + | :*<b>Case 9. Temporal databases - Part 1</b>, draft version, <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case09-Temporal-DB(1).pdf [full text]] |
::''<b>Objective</b>'': In this study we examine various ways to organize the data describing the evolution of a population of entities. The basic model consists in storing the successive states of each entity, completed by the time period during which the state was observable. We distinguish between the transaction time, that refers to the data modification time in the database and the valid time, referring to modification events of entities in the real world. This study particularly develops entity-based, attribute-based and event-based temporal database models. In these models, data management is ensured by triggers that automate as far as possible entity creation, modification and deletion operations. | ::''<b>Objective</b>'': In this study we examine various ways to organize the data describing the evolution of a population of entities. The basic model consists in storing the successive states of each entity, completed by the time period during which the state was observable. We distinguish between the transaction time, that refers to the data modification time in the database and the valid time, referring to modification events of entities in the real world. This study particularly develops entity-based, attribute-based and event-based temporal database models. In these models, data management is ensured by triggers that automate as far as possible entity creation, modification and deletion operations. | ||
::The next study will be devoted to temporal database querying. | ::The next study will be devoted to temporal database querying. | ||
::''<b>Keywords</b>'': temporal data type, temporal database, history, entity type, time point, time period, time interval, evolution, event, state, transaction time, valid time, system time, bitemporal data, document-oriented model, JSON, trigger | ::''<b>Keywords</b>'': temporal data type, temporal database, history, entity type, time point, time period, time interval, evolution, event, state, transaction time, valid time, system time, bitemporal data, document-oriented model, JSON, trigger | ||
− | :*<b>Case 11. Kings of France - Part 1</b>, draft version, | + | :*<b>Case 11. Kings of France - Part 1</b>, draft version, <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case11-Kings-of-France(1).pdf [full text]] |
− | ::'' | + | ::''<b>Objective</b>'': This study describes the French royal dynasty since Hughes Capet in 941. Its underlying goal is to study some properties and algorithms of widespread tree data structures. This first document of a series of two analyzes the dynasty of Kings of France, stores it in a database and extracts some simple information from it. The next study will be devoted to the derivation of more complex information. |
+ | ::''<b>Keywords</b>'': genealogy, tree, cyclic data structure, interval, ordering relation, temporal query, de Morgan law. | ||
− | :*<b>Case 12. Kings of France - Part 2</b>, draft version, | + | :*<b>Case 12. Kings of France - Part 2</b>, draft version, <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case12-Kings-of-France(2).pdf [full text]] [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case12-Kings-of-France-Draw.pdf [technical complement]] |
− | ::'' | + | ::''<b>Objective</b>'': In this chapter, we continue the exploitation of the KINGS database through more advanced tree processing applications, based notably on recursive scripts. The central concept from which most of these applications will derive is the transitive closure of table BRANCH, which comprises all direct and indirect ancestor/descendant couples. From it, we will build queries that count the descendants of a member, others that display the hierarchy of these descendants in various graphical way and a transitive reduction query that recovers the contents of table MEMBER from its closure. The last application, tree projection, extracts from table MEMBER a subset in which only kings appear.. |
+ | ::''<b>Keywords</b>'': genealogy, tree, cyclic data structure, transitive closure, transitive reduction, tree projection, recursive CTE, recursive query, tree drawing, tree traversal, depth-first traversal, breadth-first traversal, SQLdraw. | ||
− | :*<b>Case 14. The book of which you are the hero</b>, draft version, | + | :*<b>Case 14. The book of which you are the hero</b>, draft version, <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case14-Game-Books.pdf [full text]] |
− | ::'' | + | ::''<b>Objective</b>'': Game books are traditional text-based adventure games made up of a collection of pages (episodes) connected by references (branches). An episode comprises a text that describes a situation or an action and one or several branches that allow the gamer to jump to other episodes. Many of them are now available as pdf or html documents. In this study, we implement a simple game engine that automates such game books. This engine is based on a game database that can also be used to automatically generate stories. |
+ | ::Actually, this project is a nice opportunity to examine in some detail the concept of graph (a game book basically is a set of nodes and a set of edges) and to develop exploration and transformation algorithms. In particular, we study the structure of a game graph, we identify its abnomalies, we extract its circuits, we build and count the different possible runs from the starting episode to an exit episode, we search for unreachable episodes and dead-end branches and we identify episodes that can be merged. | ||
+ | ::A representative heroic fantasy game book has been encoded and all the algorithms developed in the study are provided as SQLfast scripts. | ||
+ | ::''<b>Keywords</b>'': computer game, game engine, story generation, graph, cyclic graph, acyclic graph, graph transformation, Marimont algorithm, reachability, circuit, elementary circuit, transitive closure, cyclic kernel, set comparison. | ||
− | :*<font color="black"><b>Case 15. Directory management</b>, draft version. | + | :*<font color="black"><b>Case 15. Directory management</b>, draft version. <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case15-Directory-Management.pdf [full text]]</font>. |
− | ::'' | + | ::''<b>Objective</b>'': The contents of storage media, such as hard disks and flash disks, both internal and external, are organized into a hierarchical structure made up of directories and files. |
+ | ::This chapter shows that, when such structures are stored in a database, processes can be designed easily to examine directories, to analyze their contents, to describe their evolution and to discover potential problems. In particular, small applications will be developed to extract statistics, to display the structure and contents of a directory, do identify and describe potentially duplicate files and directories within a root directory or between two directories. | ||
+ | ::The problem of fast clone detection, that is, of set of files that have exactly the same contents, is also analyzed and solved. | ||
+ | ::''<b>Keywords</b>'': directory structure, tree modeling, tree analysis, statistics, tree evolution, duplicate files, clone detection, secure hashing, SHA256, database performance, CTE, recursive queries. | ||
− | :*<b>Case 21. Test Database Generation</b>, | + | :*<b>Case 21. Test Database Generation</b>, obsolete version, <i>December 4, 2014.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case21-Test-databases.pdf [full text]] |
::''Chapter contents'': Database performance evaluation. Generating high volume of synthetic data. Integrating heterogeneous data sources. Data cleaning. Data anonymization. Random data extraction. Executing very large scripts. Query performance. | ::''Chapter contents'': Database performance evaluation. Generating high volume of synthetic data. Integrating heterogeneous data sources. Data cleaning. Data anonymization. Random data extraction. Executing very large scripts. Query performance. | ||
− | :*<b>Case 27. Conway's Game of Life</b>, draft version, | + | :*<b>Case 27. Conway's Game of Life</b>, draft version, <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case27-Life-Game.pdf [full text]] |
− | ::'' | + | ::''<b>Objective</b>'': This study is about games, worlds, life and death, borderline SQL applications and dramatic database optimization. The goal of the project is to implement the graphical animation of Conway’s cellular automata, aka Game of Life. A game of life is made up of an infinite array of cells in which live a population of small animals, each of them occupying one cell. The transition of one state of the population to the next one is specified by a set of simple computing rules. The goal of the game is to observe and study the evolution of the population. A game of life is implemented as a table in a database in which each row contains the coordinates and the content of a cell. The algorithms developed in this study load the initial state of a population then compute the next states thanks to the evolution rules. Finally, they visualize this evolution as an animated cartoon. The contribution of this study is twofold. It stresses the importance of database and algorithm optimization (the last version is 1,400 times faster than the first one) and it shows that relational databases and SQL may be quite efficient to develop matrix manipulation procedures (the SQL version is nearly 7 times faster than the equivalent Python program). |
+ | ::This study is also a tribute to E. F. Codd, the inventor of the relational model of databases, who first studied self-replicating cellular automata. | ||
+ | ::''<b>Keywords</b>'': cellular automata, replicating system, Conway, glider, Codd, matrix manipulation, algorithm optimization, database optimization, declarative algorithm, table indexing, in-memory database, CTE, recursive query, vector graphics, SQLdraw, animated simulation, Python. | ||
− | :*<font color="black"><b>Case 28. From data bulk loading to database book writing</b>, draft version, <i> | + | :*<font color="black"><b>Case 28. From data bulk loading to database book writing</b>, draft version, <i>September 10, 2017.</i></font>[https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case28-Topo-sort.pdf [full text]] |
− | ::'' | + | ::''<b>Objective</b>'': When data have to be loaded in a database from an external source, the order in which tables are filled is important as far as referential integrity is concerned. This order is determined by the directed graph formed by tables and foreign keys. From this graph one have to derive a linear ordering that represent one of the valid order in which table data are loaded. This derivation is called topological sorting, for which this chapter discusses and implements a simple algorithm. However, things are a bit more complex when the graph is not acyclic, as is often the case for database loading. Therefore, the chapter studies ways to transform a graph that includes circuits into a purely acyclic graph. These techniques are also applied to the ordering of topics when planning the writing of a book. |
+ | ::''<b>Keywords</b>'': data loading, database schema, (non) acyclic graph, topological sorting, strongly connected components, graph contraction, condensation of a graph, transaction management. | ||
− | :*<font color="black"><b>Case 31. Path finders, rovers and Ariadne's thread</b>, draft version. | + | :*<font color="black"><b>Case 31. Path finders, rovers and Ariadne's thread</b>, draft version. <i>September 10, 2017.</i> [https://staff.info.unamur.be/dbm/Documents/Tutorials/SQLfast/SQLfast-Case31-Shortest-path.pdf [full text]]</font>. |
− | ::'' | + | ::''<b>Objective</b>'': This chapter tackles a widespread optimization problem: computing the shortest path between two cities. The solving technique is based on Dijkstra’s algorithm. It also is applied to two similar applications domains, namely maze solving and controlling a rover on a hostile planet. A general purpose, application independent, solving tool is developed. |
+ | ::''<b>Keywords</b>'': optimization, shortest path, Dijkstra’s algorithm, maze solving, rover control. | ||
<br> | <br> | ||
<!-- ------------------------------------------------------------------------------ --> | <!-- ------------------------------------------------------------------------------ --> |
<Retour à la page d'accueil / Back>
Sommaire |
Case studies in preparation