1 change: 1 addition & 0 deletions Dockerfile
@@ -25,4 +25,5 @@ RUN python -m pip install -v -r /home/${user}/RandomTelecomPayments/requirements
# set working directory for random telecom payments app
WORKDIR /home/${user}/RandomTelecomPayments

EXPOSE 8000
ENTRYPOINT ["python", "generator/main.py"]
60 changes: 28 additions & 32 deletions README.md
@@ -2,9 +2,9 @@

## Overview

Randomly simulated data is particularly useful when its real-world counterpart is hard to access for reasons of complexity, privacy, or security. Randomly simulated data also offers reproducibility, scalability, and controllability.

This application simulates telecommunication payments using random number generation. It includes typical transaction-level relationships and behaviours amongst the user, device, IP, and card entities. It can be used in place of real-world telecommunication payments for prototyping solutions and as an educational tool.

The data generation algorithm works by first generating user level telecom payments data. Afterwards, the user level data is exploded to transaction level, and any inconsistencies within the data model are removed. Finally, the transaction status and error codes are generated using underlying features within the transaction level data.

@@ -16,7 +16,7 @@ A stable master version of the Random Telecom Payments data can be found on Kagg

## Data Model

The data model underlying the simulated telecommunication payments is displayed below.

![Entity Relationship Diagram](doc/entity_relationship_diagram.jpg)

@@ -26,27 +26,16 @@ For a more detailed account of each column in the dataset see the data dictionar

## Running the Application (Windows)

### Anaconda

Create a local conda environment for the Random Telecom Payments app using [anaconda](https://www.anaconda.com/):

```
conda create --name RandomTelecomPayments python=3.12 --yes
conda activate RandomTelecomPayments
pip install -r requirements.txt
```

Execute the Random Telecom Payments app to generate data for 2000 users using the following command and the local conda environment:

```
python generator\\main.py --n_users 1000 --use_random_seed 1 --n_itr 2
```

View the generated Random Telecom Payments data using the following command:
### Application Parameters

```
type data\\RandomTelecomPayments.csv | more
```
* **n_users** - integer, the number of users to generate Random Telecom Payments data for, default is 100.
* **use_random_seed** - integer, whether to run the Random Telecom Payments data generation with or without a random seed set for reproducible results; must be 0 or 1.
* **n_itr** - integer, the number of Random Telecom Payments data batches to generate; must be at least 1. Python's multiprocessing library is used to run the batches in parallel across all available cores.
* **n_applications** - integer, the number of applications to generate, default is 20000.
* **registration_start_date** - string, the start date for user registrations, default is two years ago from today.
* **registration_end_date** - string, the end date for user registrations, default is one year ago from today.
* **transaction_start_date** - string, the start date for user transactions, default is one year ago from today.
* **transaction_end_date** - string, the end date for user transactions, default is today.
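
The parameters listed above could be parsed on the command line roughly as follows. This is a minimal sketch rather than the project's actual parser: `build_parser` and `parse_args` are hypothetical names, the defaults for `use_random_seed` and `n_itr` are assumptions (the README does not state them), and the date options default to `None` here to stand in for "use the application's own default".

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented application parameters."""
    parser = argparse.ArgumentParser(description="Random Telecom Payments (sketch)")
    parser.add_argument("--n_users", type=int, default=100)
    # must be 0 or 1; the default of 0 is an assumption
    parser.add_argument("--use_random_seed", type=int, choices=[0, 1], default=0)
    # must be at least 1; the default of 1 is an assumption
    parser.add_argument("--n_itr", type=int, default=1)
    parser.add_argument("--n_applications", type=int, default=20000)
    # None stands in for "fall back to the application's date defaults"
    for name in (
        "registration_start_date",
        "registration_end_date",
        "transaction_start_date",
        "transaction_end_date",
    ):
        parser.add_argument(f"--{name}", type=str, default=None)
    return parser


def parse_args(argv):
    """Parse argv and enforce the documented lower bound on n_itr."""
    args = build_parser().parse_args(argv)
    if args.n_itr < 1:
        raise ValueError("n_itr must be at least 1")
    return args
```

For example, `parse_args(["--n_users", "1000", "--use_random_seed", "1", "--n_itr", "2"])` yields a namespace with two batches of 1000 users each.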

### Docker

Expand All @@ -60,6 +49,8 @@ The docker image can be pulled from dockerhub using the following command:
docker pull oislen/randomtelecompayments:latest
```

#### Command Line Interface

The Random Telecom Payments app can then be executed to generate data for 2000 users using the following command and the docker image:

```
@@ -69,15 +60,20 @@ docker run --name rtp oislen/randomtelecompayments:latest --n_users 1000 --use_r
The generated Random Telecom Payments data can then be extracted from the docker container using the following command:

```
docker cp rtp:/home/ubuntu/RandomTelecomPayments/data/RandomTelecomPayments.csv %userprofile%\Downloads\RandomTelecomPayments.csv
docker cp rtp:/home/user/RandomTelecomPayments/data/RandomTelecomPayments.csv %userprofile%\Downloads\RandomTelecomPayments.csv
```

#### FastApi Interface

Alternatively, a FastApi interface has been configured within the docker image to allow for interaction with the Random Telecom Payments app via REST API calls. The FastApi interface can be accessed by publishing port 8000 when running the docker image as follows:

```
docker run --name rtp --publish 8000:8000 --entrypoint fastapi --rm oislen/randomtelecompayments:latest run generator/api.py
```

Once the web endpoint is running, navigate to localhost:8000/docs in your preferred browser to access the FastApi interface documentation and test the available API calls.

* http://localhost:8000/docs


![FastApi Endpoint](doc/fastapi_endpoint.jpg)
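
Besides the browser, the endpoint can be called from a script. The sketch below uses only the Python standard library and assumes the container is running with port 8000 published as shown above. The helper names (`build_get_url`, `decode_response`, `fetch_transactions`) are hypothetical. Because the handlers in `generator/api.py` return a JSON string, the response body is likely to arrive as a JSON-encoded string, so the decoder tolerates a second decoding pass; treat that as an assumption to verify against the running service.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000/api"  # assumes --publish 8000:8000


def build_get_url(base_url: str = BASE_URL, **params) -> str:
    """Append query parameters (e.g. n_users, use_random_seed) to the /api endpoint."""
    return f"{base_url}?{urlencode(params)}" if params else base_url


def decode_response(body: str):
    """Decode the body; if the result is itself a JSON string, decode it once more."""
    payload = json.loads(body)
    if isinstance(payload, str):
        payload = json.loads(payload)
    return payload


def fetch_transactions(**params):
    """Call the running API and return the transaction records (needs a live container)."""
    with urlopen(build_get_url(**params)) as resp:
        return decode_response(resp.read().decode("utf-8"))
```

With the container up, `fetch_transactions(n_users=5, use_random_seed=1)` would return a list of transaction records.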
95 changes: 95 additions & 0 deletions doc/RandomTelecomPayments.postman_collection.json
@@ -0,0 +1,95 @@
{
"info": {
"_postman_id": "ff7bfe7a-c3ca-4d73-a609-6094c57def45",
"name": "RandomTelecomPayments",
"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
"_exporter_id": "39794605"
},
"item": [
{
"name": "/api",
"request": {
"method": "GET",
"header": [],
"url": {
"raw": "http://127.0.0.1:8000/api",
"protocol": "http",
"host": [
"127",
"0",
"0",
"1"
],
"port": "8000",
"path": [
"api"
]
}
},
"response": []
},
{
"name": "/api?n_users=5&use_random_seed=1",
"request": {
"method": "GET",
"header": [],
"url": {
"raw": "http://127.0.0.1:8000/api?n_users=5&use_random_seed=1",
"protocol": "http",
"host": [
"127",
"0",
"0",
"1"
],
"port": "8000",
"path": [
"api"
],
"query": [
{
"key": "n_users",
"value": "5"
},
{
"key": "use_random_seed",
"value": "1"
}
]
}
},
"response": []
},
{
"name": "/api",
"request": {
"method": "POST",
"header": [],
"body": {
"mode": "raw",
"raw": "{\r\n \"n_users\": 1,\r\n \"use_random_seed\": 1,\r\n \"n_itr\": 1,\r\n \"n_applications\": 20000,\r\n \"registration_start_date\": \"2024-01-01\",\r\n \"registration_end_date\": \"2024-12-31\",\r\n \"transaction_start_date\": \"2025-01-01\",\r\n \"transaction_end_date\": \"2025-12-31\"\r\n}",
"options": {
"raw": {
"language": "json"
}
}
},
"url": {
"raw": "http://127.0.0.1:8000/api",
"protocol": "http",
"host": [
"127",
"0",
"0",
"1"
],
"port": "8000",
"path": [
"api"
]
}
},
"response": []
}
]
}
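
The POST request in the collection above can also be reproduced without Postman. The following is a minimal standard-library sketch: `build_post_request` is a hypothetical helper, and the URL assumes the API is listening on 127.0.0.1:8000 as in the collection.

```python
import json
from urllib.request import Request

API_URL = "http://127.0.0.1:8000/api"  # same address as the Postman collection


def build_post_request(params: dict, url: str = API_URL) -> Request:
    """Build (but do not send) a JSON POST request for the /api endpoint."""
    data = json.dumps(params).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=data, headers=headers, method="POST")


# parameters taken from the collection's POST body
request = build_post_request(
    {
        "n_users": 1,
        "use_random_seed": 1,
        "n_itr": 1,
        "n_applications": 20000,
        "registration_start_date": "2024-01-01",
        "registration_end_date": "2024-12-31",
        "transaction_start_date": "2025-01-01",
        "transaction_end_date": "2025-12-31",
    }
)
```

Passing `request` to `urllib.request.urlopen` sends it once the service is running.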
Binary file added doc/fastapi_endpoint.jpg
1 change: 1 addition & 0 deletions exeDocker.cmd
@@ -16,6 +16,7 @@ SET UBUNTU_DIR=/home/ubuntu
call docker run --name %DOCKER_CONTAINER_NAME% --memory 7GB --volume E:\GitHub\RandomTelecomPayments\data:/home/ubuntu/RandomTelecomPayments/data --rm %DOCKER_IMAGE% --n_users 100 --use_random_seed 1 --n_itr 1
:: call docker run --name %DOCKER_CONTAINER_NAME%w --memory 7GB --volume E:\GitHub\RandomTelecomPayments\data:/home/ubuntu/RandomTelecomPayments/data --rm %DOCKER_IMAGE% --n_users 13000 --use_random_seed 1 --n_itr 2
:: call docker run -it --entrypoint bash --name %DOCKER_CONTAINER_NAME% --memory 7GB --volume E:\GitHub\RandomTelecomPayments\data:/home/ubuntu/RandomTelecomPayments/data --rm %DOCKER_IMAGE%
:: call docker run --name %DOCKER_CONTAINER_NAME% --publish 8000:8000 --memory 7GB --entrypoint fastapi --rm %DOCKER_IMAGE% run generator/api.py

:: useful docker commands
:: docker images
120 changes: 120 additions & 0 deletions generator/api.py
@@ -0,0 +1,120 @@
import json
from typing import Annotated, Dict

from fastapi import FastAPI, Query

import cons
from main import main
from utilities.JsonEncoder import JsonEncoder

tags_metadata = [
    {
        "name": "Random Telecom Payments Data Generator",
        "description": "Generate random telecom payments data based on user-defined parameters.",
    },
]

app = FastAPI(
    title="Random Telecom Payments Data Generator API",
    description="An API to generate random telecom payments data based on user-defined parameters.",
    version="0.0.0",
    openapi_tags=tags_metadata,
)

@app.get("/api", tags=["Random Telecom Payments Data Generator"])
async def get_api(
    n_users: Annotated[int, Query(title="Number of Users", description="The number of users")] = cons.default_n_users,
    use_random_seed: Annotated[int, Query(title="Use Random Seed", description="Whether to set a random seed (0 or 1)", ge=0, le=1)] = cons.default_use_random_seed,
    n_itr: Annotated[int, Query(title="Number of Iterations", description="The number of iterations", ge=1)] = cons.default_n_itr,
    n_applications: Annotated[int, Query(title="Number of Applications", description="The number of applications", ge=1)] = cons.default_n_applications,
    registration_start_date: Annotated[str, Query(title="Registration Start Date", description="The registration start date in YYYY-MM-DD format")] = cons.default_registration_start_date,
    registration_end_date: Annotated[str, Query(title="Registration End Date", description="The registration end date in YYYY-MM-DD format")] = cons.default_registration_end_date,
    transaction_start_date: Annotated[str, Query(title="Transaction Start Date", description="The transaction start date in YYYY-MM-DD format")] = cons.default_transaction_start_date,
    transaction_end_date: Annotated[str, Query(title="Transaction End Date", description="The transaction end date in YYYY-MM-DD format")] = cons.default_transaction_end_date,
):
    """
    Generate random telecom payments data based on user-defined parameters.

    Parameters
    ----------
    n_users : int
        The number of users.
    use_random_seed : int
        Whether to set a random seed for reproducible results (0 or 1).
    n_itr : int
        The number of iterations.
    n_applications : int
        The number of applications.
    registration_start_date : str
        The registration start date in YYYY-MM-DD format.
    registration_end_date : str
        The registration end date in YYYY-MM-DD format.
    transaction_start_date : str
        The transaction start date in YYYY-MM-DD format.
    transaction_end_date : str
        The transaction end date in YYYY-MM-DD format.

    Returns
    -------
    response : str
        A JSON string containing the generated telecom payments data.
    """
    # generate parameters dictionary
    input_params_dict = {
        "n_users": n_users,
        "use_random_seed": use_random_seed,
        "n_itr": n_itr,
        "n_applications": n_applications,
        "registration_start_date": registration_start_date,
        "registration_end_date": registration_end_date,
        "transaction_start_date": transaction_start_date,
        "transaction_end_date": transaction_end_date,
    }
    # run random telecom payments generator
    output_data_dict = main(input_params_dict=input_params_dict)
    # convert transaction data to dictionary and then to json response
    trans_data_dict = output_data_dict['trans_data'].to_dict(orient='records')
    response = json.dumps(trans_data_dict, cls=JsonEncoder)
    return response

@app.post("/api", tags=["Random Telecom Payments Data Generator"])
async def post_api(
    body: Dict[str, object] = {}
):
    """
    Generate random telecom payments data based on user-defined parameters.

    Parameters
    ----------
    body : Dict[str, object]
        A dictionary containing the input parameters.
        Any key that is omitted falls back to its default in cons.default_input_params_dict.
        Possible keys are:
        - n_users : int
            The number of users.
        - use_random_seed : int
            Whether to set a random seed for reproducible results (0 or 1).
        - n_itr : int
            The number of iterations.
        - n_applications : int
            The number of applications.
        - registration_start_date : str
            The registration start date in YYYY-MM-DD format.
        - registration_end_date : str
            The registration end date in YYYY-MM-DD format.
        - transaction_start_date : str
            The transaction start date in YYYY-MM-DD format.
        - transaction_end_date : str
            The transaction end date in YYYY-MM-DD format.

    Returns
    -------
    response : str
        A JSON string containing the generated telecom payments data.
    """
    # generate parameters dictionary; values in body override the defaults
    input_params_dict = {**cons.default_input_params_dict, **body}
    # run random telecom payments generator
    output_data_dict = main(input_params_dict=input_params_dict)
    # convert transaction data to dictionary and then to json response
    trans_data_dict = output_data_dict['trans_data'].to_dict(orient='records')
    response = json.dumps(trans_data_dict, cls=JsonEncoder)
    return response
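
The POST handler merges the request body over the defaults with dictionary unpacking, so later keys win. The sketch below isolates that step and adds one optional safeguard that is not part of the PR: rejecting keys that are not known parameters, which would surface a misspelt key instead of silently ignoring it. `DEFAULTS` holds illustrative values only and `merge_params` is a hypothetical name.

```python
# illustrative stand-in for cons.default_input_params_dict (values are assumptions)
DEFAULTS = {"n_users": 100, "use_random_seed": 0, "n_itr": 1}


def merge_params(body: dict, defaults: dict = DEFAULTS) -> dict:
    """Overlay request values on the defaults, refusing unknown keys."""
    unknown = set(body) - set(defaults)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # keys in body take precedence because they are unpacked last
    return {**defaults, **body}
```

An empty body therefore reproduces the defaults unchanged, while `{"n_users": 5}` overrides only that one entry.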
4 changes: 4 additions & 0 deletions generator/app/gen_random_telecom_data.py
@@ -114,4 +114,8 @@ def gen_random_telecom_data(
        fpath_countrycrimeindex=cons.fpath_countrycrimeindex
    )

    # map np.nans to None for JSON serialisation
    user_data = user_data.where(pd.notnull(user_data), None)
    trans_data = trans_data.where(pd.notnull(trans_data), None)

    return {"user_data":user_data, "trans_data":trans_data}
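
The reason for mapping NaN to `None` is that Python's `json` module writes NaN as the bare token `NaN`, which is not valid JSON, whereas `None` becomes `null`. The standard-library sketch below shows the same idea on plain records instead of a DataFrame. `nan_to_none` and the column names are hypothetical and only illustrate the conversion.

```python
import json
import math


def nan_to_none(records):
    """Replace float NaN values with None in a list of row dictionaries."""
    return [
        {key: (None if isinstance(value, float) and math.isnan(value) else value)
         for key, value in row.items()}
        for row in records
    ]


rows = [{"user_id": 1, "card_hash": float("nan")}]  # hypothetical columns
raw_json = json.dumps(rows)                 # contains the non-standard token NaN
clean_json = json.dumps(nan_to_none(rows))  # contains null instead
```

Passing `allow_nan=False` to `json.dumps` is a convenient way to confirm that no NaN values remain, since it raises `ValueError` if any do.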
11 changes: 11 additions & 0 deletions generator/cons.py
@@ -48,6 +48,17 @@
default_registration_end_date = (date_today - datetime.timedelta(days=366)).strftime(date_date_strftime)
default_transaction_start_date = (date_today - datetime.timedelta(days=365)).strftime(date_date_strftime)
default_transaction_end_date = date_today.strftime(date_date_strftime)
# define default input parameters dictionary
default_input_params_dict = {
    "n_users": default_n_users,
    "use_random_seed": default_use_random_seed,
    "n_itr": default_n_itr,
    "n_applications": default_n_applications,
    "registration_start_date": default_registration_start_date,
    "registration_end_date": default_registration_end_date,
    "transaction_start_date": default_transaction_start_date,
    "transaction_end_date": default_transaction_end_date
}
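
The date defaults are offsets from today: registrations end 366 days ago and transactions start 365 days ago, so the registration window closes the day before the transaction window opens. A minimal sketch of that arithmetic follows. `default_date_window` is a hypothetical helper, the `%Y-%m-%d` format is an assumed value for `date_date_strftime`, and the 730-day offset for the registration start is an assumption because that line is not visible in the diff.

```python
from datetime import date, timedelta

DATE_FORMAT = "%Y-%m-%d"  # assumed value of cons.date_date_strftime


def default_date_window(today: date) -> dict:
    """Compute the default registration and transaction date ranges relative to today."""
    def days_ago(n: int) -> str:
        return (today - timedelta(days=n)).strftime(DATE_FORMAT)

    return {
        "registration_start_date": days_ago(730),  # assumption: roughly two years ago
        "registration_end_date": days_ago(366),
        "transaction_start_date": days_ago(365),
        "transaction_end_date": days_ago(0),
    }


window = default_date_window(date(2025, 1, 1))
```

Because the dates are formatted as ISO strings, comparing them lexicographically gives the same ordering as comparing the underlying dates.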

# set unittest constants
unittest_seed = 42
1 change: 1 addition & 0 deletions generator/exeApi.cmd
@@ -0,0 +1 @@
call fastapi run api.py
1 change: 1 addition & 0 deletions generator/exeApi.sh
@@ -0,0 +1 @@
fastapi run api.py