Populating the Employee Database with Test Data

Proceeding with the example of an employee database, this time we will figure out how to populate a database with test data.

Evgeniy Gribkov
Towards Data Science

--

Photo from Piqsels (CC0)

In this article, we will examine the process of populating the employee database with dummy data, whose schema we designed in the previous article. Filling a SQL database with dummy data can be very useful when we want to run some tests. The most convenient way is the population of SQL tables with random data with the help of visual data generation tools.

Generating data with the help of Data Generator for SQL Server

For this purposes we’ll use Data Generator for SQL Server tool which is integrated into SSMS and is also included in dbForge Studio. It should be noted that realistic test data is generated based on column names, dimensions, and data types. Apart from this, the relationships between tables are also taken into account, as the process of data generation depends on them.

To open this component, right-click “New Data Generation…” on the necessary database in SSMS:

Img. 1. Running the Data Generator for SQL Server tool in SSMS

If you are using dbForge Studio, in the main menu, choose Tools\New Data Generation…:

Img.2. Running the Data Generator for SQL Server tool in dbForge Studio

In the resulting window, on the “Connection” tab, you can see the current MS SQL Server instance and the database selected for data generation, which can be edited (if necessary). Then click the “Next” button:

Img.3. Setting the “Connection” tab

Next, on the “Options” tab, let’s set the options of data generation for the database:

Img.4. Setting data generation options

Note that you can generate SQL test data in different modes:

  1. By specified number of rows (1000 rows by default)
  2. By proportion of existing data in the volume of percent (10 % by default)
  3. By generation of data by time (10 seconds by default)

You can also clear data before generation by setting the “Truncate data from table before generation” parameter.

You can set the value distribution mode in one of the following ways:

  1. Random by timestamp
  2. Random by seed (1 by default)
  3. Sequential

Also, you can set column properties:

  1. Set values to be unique
  2. Include NULL values (10% of rows by default)
  3. Include empty values (10% of rows by default)

You can save the settings to a .bat file by pressing the “Save Command Line…” button located on the lower left of the data generation settings window.

After you are finished with the settings, on the lower right of the data generation settings window, press the “Open” button.

You will then see a progress bar showing the table metadata loading. After that, the window with detailed data generation settings for each selected table appears:

Img.5. Detailed data generation settings for each selected table

On the left, you should select the tables and columns you want to populate, and on the right, you should set the table generation mode for the selected table.

At the same time, below are the instances of generated data (note that they represent real names).

In the top right corner, there is a button of data generation settings that were described above.

To start the data generation process, click on the green arrow at the top center of the window. Then, you will see the window for selecting additional settings. Here, on the Output tab, you need to select exactly where to generate the data, in the form of a script, save it to a file or to a database. Let us select the last option and press Next:

Img. 6. Setting the “Output” tab

Then, you can set additional parameters on the Options tab. In this case, you need to uncheck database backup options and press “Next”:

Img.7. Setting the “Options” tab

On the “Additional Scripts” tab, you can set additional scripts. In our case, we just press “Next”:

Img.8. Setting the “Additional Scripts” tab

On the “Summary” tab, we can see the information about settings and also warnings. Here, you can also save all settings as a .bat file, by pressing the bottom left “Save Command Line…” button. To run the data generation process, you need to press the “Generate” button:

Img.9. The general information and warnings on the Summary tab

The window of data generation process appears:

Img.10. Data generation process

Then, the tables will be populated with data. For instance, the Employee table has the following generated data:

Img.11. The examples of generated data in the Employee table

Conclusions

To sum up, we populated the database with realistic data for testing both functionality and load. It is possible to generate much more random data for load tests. In addition to that, the very process of testing can be accelerated by means of the dbForge Unit Test tool.

What is more, through the use of SQL data generation, you can calculate not only a database growth rate but also a query performance difference that results from the data volume increase.

Originally published at https://blog.devart.com on July 22, 2020.

--

--

I am interested in everything related to the database and data. Professionally engaged in MS SQL Server as a developer and administrator.