Seed

Seeding makes the sequence of random data generated by a provider the same on every run, so data generation becomes deterministic and reproducible.

When seeding is useful

By default, providers like first_name generate different, random values every time they are called:

from fake import FAKER

# Different results on successive calls
print(FAKER.first_name())  # e.g., 'Lars'
print(FAKER.first_name())  # e.g., 'Ka-Ping'
print(FAKER.first_name())  # e.g., 'Ben'

This behavior is typically desired for data generation, but it can be problematic when writing unit tests or attempting to debug an issue that occurred with a specific set of generated data.

When you call FAKER.seed(value) with an integer value (the seed), the internal random number generator is reset to the same starting point, so subsequent provider calls yield the same sequence of values. This helps in two ways:

  • Deterministic testing: Seeding allows you to write tests that rely on generated data. If the data is always the same, you can assert specific outcomes, making your tests reliable and robust.

  • Reproducible debugging: If a bug appears in a development or testing environment using generated data, re-running the data generation process with the same seed ensures you can recreate the exact data set that caused the failure, simplifying debugging.
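
For reproducible debugging, one common pattern is to log the seed at the start of every run so that a failing run can be replayed later. The sketch below is an illustration under stated assumptions, not part of the library: the helper name resolve_seed and the environment variable FAKE_SEED are hypothetical, and only the standard library is used. The resulting value would then be passed to FAKER.seed.

```python
import os
import random


def resolve_seed(env_var="FAKE_SEED"):
    """Return the seed to use for this run.

    Hypothetical helper: reads an integer from the environment variable
    (so a failing run can be replayed), otherwise picks a fresh seed.
    """
    raw = os.environ.get(env_var)
    if raw is not None:
        return int(raw)
    # SystemRandom draws from the OS, so earlier seeding does not affect it.
    return random.SystemRandom().randrange(2**32)


seed = resolve_seed()
print(f"Using seed {seed}")  # log it; rerun with FAKE_SEED=<value> to replay
# Next step, as described above: FAKER.seed(seed)
```

Reading the seed from the environment keeps the replay step out of the code: the same script runs with a fresh seed by default and with a fixed one when a failure needs to be investigated.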

The generated sequence is determined by the seed, so reseeding with the same value replays it:

from fake import FAKER

FAKER.seed(42)

# First run with seed 42
print(FAKER.first_name())  # 'Steven'
print(FAKER.first_name())  # 'Ben'
print(FAKER.first_name())  # 'Andrew'
print(FAKER.first_name())  # 'Zooko'

FAKER.seed(42)  # Reset the seed

# Second run with the same seed (42) produces the same sequence
print(FAKER.first_name())  # 'Steven'
print(FAKER.first_name())  # 'Ben'
print(FAKER.first_name())  # 'Andrew'
print(FAKER.first_name())  # 'Zooko'
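
To illustrate the mechanism, the sketch below uses Python's standard random module as a stand-in. It is an analogy only: how fake.py manages its generator internally is not shown here, and the helper name draw is hypothetical. The same seed replays the same sequence, while a different seed gives a different one.

```python
import random


def draw(seed, count=4):
    """Return `count` pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator with an explicit seed
    return [rng.random() for _ in range(count)]


first = draw(42)
replay = draw(42)  # same seed, so the same starting point
other = draw(43)   # different seed, so a different starting point

print(first == replay)  # True
print(first == other)   # False
```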

The following providers produce consistent results after seeding:

  • first_name

  • first_names

  • last_name

  • last_names

  • name

  • names

  • username

  • usernames

  • slug

  • slugs

  • word

  • words

  • sentence

  • sentences

  • paragraph

  • paragraphs

  • text

  • texts

  • dir_path

  • file_extension

  • mime_type

  • tld

  • domain_name

  • free_email_domain

  • email

  • emails

  • company_email

  • company_emails

  • free_email

  • free_emails

  • url

  • image_url

  • pyint

  • pybool

  • pystr

  • pyfloat

  • pydecimal

  • ipv4

  • date

  • year

  • time

  • city

  • country

  • geo_location

  • country_code

  • locale

  • latitude

  • longitude

  • latitude_longitude

  • iban

  • isbn10

  • isbn13

  • random_choice

  • random_sample

  • randomise_string

  • string_template

Best practice

If you need to seed, create a dedicated Faker instance rather than seeding a shared one. This avoids interfering with other parts of your application that may rely on a separate, unseeded Faker instance.

from fake import Faker

FAKER = Faker()

You can then seed it and check that the output is reproducible:

FAKER.seed(42)
l1 = [FAKER.pyint(), FAKER.pyint(), FAKER.pyint(), FAKER.pyint()]

FAKER.seed(42)
l2 = [FAKER.pyint(), FAKER.pyint(), FAKER.pyint(), FAKER.pyint()]

assert l1 == l2