Make sure you’ve done these steps first:
- You’ve installed SQL Server with Python (Machine Learning Services)
- You’ve installed pip
- You’ve installed Pandas using pip
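The checklist above can be sanity-checked from Python itself. This is my own quick sketch (not part of the walkthrough) that reports which of the packages used later cannot be imported from whichever Python runtime you point it at:

```python
# Sanity check: report which of the packages used below are not importable
# from the current Python runtime.
from importlib.util import find_spec

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]

# Against SQL Server's bundled Python, run this from the PYTHON_SERVICES
# prompt (or via sp_execute_external_script), not your system Python.
print(missing_packages(["pandas", "faker"]))
```

An empty list means you’re good to go; anything printed is a package still to install.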
With those in place, let’s get started.
We’re going to use a Python library called Faker, which is designed to generate test data. You’ll need to open a command prompt in the folder where pip is installed. In my standard installation of SQL Server 2019 it’s here (adjust for your own installation):
C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019PYTHON\PYTHON_SERVICES\Scripts
From here you want to run the following command to install Faker:
```shell
pip install Faker
```

Once the install has finished, we can open SSMS and get started with our test data.
We’re going to start with the sample query from the official documentation, but we have to add a print statement to see our results because we’re using SSMS:
```sql
DECLARE @sql nvarchar(max)
SET @sql = N'#!/usr/bin/env python3
from faker import Faker

fake = Faker()
print(fake.name())'

execute sp_execute_external_script
    @language = N'Python',
    @script = @sql
```
If you run this in SSMS, you’ll see the output in the Messages window.

Now that we know it works, let’s put this into a usable format within SQL Server.
This is going to be our block of Python:
```python
#!/usr/bin/env python3
import pandas as pd
from faker import Faker

fake = Faker()

# a dictionary of lists is how Python will hold our table
data_dict = {
    "name": [],
    "address": []
}

# insert 100 random names and addresses into the dictionary
for x in range(100):
    data_dict["name"].append(fake.name())
    data_dict["address"].append(fake.address())

# using pandas to create a data frame makes it a more presentable format
output_data = pd.DataFrame(data_dict)

# this is the data we're going to export to SQL Server
OutputDataSet = output_data
```
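If you want to see the dictionary-of-lists pattern in isolation before involving SQL Server, here’s a minimal standard-library sketch of my own; `random.choice` stands in for Faker’s generators, and the names and street format are made up purely for illustration:

```python
# Stand-in sketch: builds the same dictionary-of-lists shape as the Faker
# script, using only the standard library (random.choice replaces Faker).
import random

FIRST_NAMES = ["Alice", "Bob", "Carol", "Dan"]
SURNAMES = ["Smith", "Jones", "Taylor", "Khan"]

def build_table(rows):
    """Return a dict of equal-length lists, one list per column."""
    data_dict = {"name": [], "address": []}
    for _ in range(rows):
        data_dict["name"].append(
            f"{random.choice(FIRST_NAMES)} {random.choice(SURNAMES)}"
        )
        data_dict["address"].append(f"{random.randint(1, 999)} Main Street")
    return data_dict

table = build_table(100)
print(len(table["name"]), len(table["address"]))
```

Each key becomes a column and each list holds that column’s values, which is exactly the shape `pd.DataFrame` expects.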
For the purposes of this example, we’re going to create a temp table to store the data so we can view what we’ve done. Wrapping this Python script in T-SQL will give us an output like so:
```sql
DECLARE @sql nvarchar(max)
SET @sql = N'#!/usr/bin/env python3
import pandas as pd
from faker import Faker

fake = Faker()

# a dictionary of lists is how Python will hold our table
data_dict = {
    "name": [],
    "address": []
}

# insert 100 random names and addresses into the dictionary
for x in range(100):
    data_dict["name"].append(fake.name())
    data_dict["address"].append(fake.address())

# using pandas to create a data frame makes it a more presentable format
output_data = pd.DataFrame(data_dict)

# this is the data we''re going to export to SQL Server
OutputDataSet = output_data'

IF OBJECT_ID('tempdb..#FakeData') IS NOT NULL
    DROP TABLE #FakeData

BEGIN
    CREATE TABLE #FakeData
    (
        [name] varchar(max)
        ,[address] varchar(max)
    )
END

INSERT INTO #FakeData (address, name) /* data comes out by column name alphabetically */
execute sp_execute_external_script
    @language = N'Python',
    @script = @sql

SELECT * FROM #FakeData
```
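One fiddly part of the wrapper above is doubling every single quote inside the Python script so it survives the T-SQL `N'...'` string literal. Here’s a small helper of my own (hypothetical, just to illustrate the escaping rule) that does the doubling for you:

```python
# Hypothetical helper: doubles single quotes so a Python script can sit
# inside a T-SQL N'...' literal, as the wrapper above does by hand.
def embed_in_tsql(python_script):
    escaped = python_script.replace("'", "''")
    return f"DECLARE @sql nvarchar(max)\nSET @sql = N'{escaped}'"

print(embed_in_tsql("print('hello')"))
```

Paste your working Python into it and it emits the `DECLARE`/`SET` preamble with the quotes already escaped, ready to combine with the `sp_execute_external_script` call.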
Go ahead and run it; you should see a sample of 100 names and addresses now stored in your temp table:

There are far more options when using Faker. Looking at the official documentation, you’ll see the list of different data types you can generate, as well as options such as region-specific data.
Go have fun trying this; it’s a small amount of setup for a large amount of time saved.