This repository contains Jupyter Notebooks related to fraud detection, data streaming, and real-time data visualization. These notebooks cover various aspects of processing, analyzing, and modeling data to address fraudulent transactions in eCommerce and other contexts.
## Notebooks

- **Analysing Fraudulent Transaction Data.ipynb**
  - Purpose: Exploratory data analysis (EDA) of fraudulent transaction datasets.
  - Key Components:
    - Analyzing patterns in fraudulent transactions.
    - Visualizing data distributions and key features.
  - Libraries used: `pandas`, `matplotlib`, `seaborn`.
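As an illustration of the kind of EDA this notebook performs, here is a minimal sketch; the column names (`amount`, `category`, `is_fraud`) and the toy data are assumptions, not the actual dataset schema:

```python
import pandas as pd

# Toy stand-in for the real transaction dataset (schema is assumed).
df = pd.DataFrame({
    "amount":   [20.0, 950.0, 15.5, 1200.0, 40.0, 880.0],
    "category": ["food", "electronics", "food", "electronics", "food", "electronics"],
    "is_fraud": [0, 1, 0, 1, 0, 0],
})

# Fraud rate and average amount per category -- a typical first EDA cut.
summary = df.groupby("category").agg(
    fraud_rate=("is_fraud", "mean"),
    avg_amount=("amount", "mean"),
    n=("is_fraud", "size"),
)
print(summary)
```

In the notebook itself, `matplotlib` and `seaborn` would visualize these distributions (for example, a histogram of `amount` split by the fraud label).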
- **Building Models for eCommerce Fraud Detection.ipynb**
  - Purpose: Building and evaluating machine learning models for fraud detection.
  - Key Components:
    - Preprocessing data for model training.
    - Training and evaluating models such as Logistic Regression and Random Forest.
  - Libraries used: `scikit-learn`, `numpy`, `pandas`.
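A minimal sketch of the modeling workflow, using synthetic data rather than the real dataset (the two features and their distributions are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features: fraudulent transactions here have larger amounts
# and occur at night -- an assumption made only to produce separable data.
rng = np.random.default_rng(42)
n = 500
amounts = np.concatenate([rng.normal(50, 15, n), rng.normal(400, 100, n)])
hours = np.concatenate([rng.integers(8, 22, n), rng.integers(0, 6, n)])
X = np.column_stack([amounts, hours])
y = np.concatenate([np.zeros(n), np.ones(n)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing (scaling) plus Logistic Regression; a RandomForestClassifier
# could be swapped into the same pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

On real fraud data, which is heavily imbalanced, accuracy alone is a poor metric; the notebook's evaluation would typically also look at precision/recall on the fraud class.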
- **Producing the Data.ipynb**
  - Purpose: Simulating and producing data streams for analysis.
  - Key Components:
    - Generating mock data for fraud scenarios.
    - Producing data using streaming technologies.
  - Libraries used: `faker`, `pandas`.
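The data-production step can be sketched with the standard library alone; the notebook uses `faker` for realistic names and emails, but this stand-in shows the shape of the records (all field names and the fraud-rate logic are illustrative assumptions):

```python
import json
import random
import uuid
from datetime import datetime, timezone

def make_transaction(fraud_rate=0.05, rng=random):
    """Generate one mock transaction; roughly `fraud_rate` of them are fraudulent."""
    is_fraud = rng.random() < fraud_rate
    return {
        "transaction_id": str(uuid.uuid4()),
        # Assumed pattern: fraudulent transactions skew toward larger amounts.
        "amount": round(rng.uniform(500, 2000) if is_fraud else rng.uniform(1, 200), 2),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "is_fraud": is_fraud,
    }

# A producer would json.dumps each record and send it to a streaming topic.
batch = [make_transaction() for _ in range(5)]
print(json.dumps(batch[0]))
```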
- **Consuming Data Using Kafka and Visualise.ipynb**
  - Purpose: Consuming data streams and visualizing results.
  - Key Components:
    - Setting up Kafka consumers to read data streams.
    - Visualizing the processed data for insights.
  - Libraries used: `kafka-python`, `matplotlib`.
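The consumer side can be sketched as follows. The topic name `transactions` and the broker address are assumptions; `consume()` is only defined here, not run, because it needs `kafka-python` and a live broker:

```python
import json

def decode_message(raw: bytes) -> dict:
    """Deserialize one Kafka message value (assumed to be JSON-encoded)."""
    return json.loads(raw.decode("utf-8"))

def consume(topic="transactions", bootstrap_servers="localhost:9092", limit=10):
    """Read up to `limit` records from the topic and return them as dicts."""
    from kafka import KafkaConsumer  # deferred import: needs kafka-python + a broker

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        auto_offset_reset="earliest",
        value_deserializer=decode_message,
    )
    records = []
    for message in consumer:
        records.append(message.value)
        if len(records) >= limit:
            break
    consumer.close()
    return records

# The notebook would then plot the collected records with matplotlib,
# e.g. transaction amounts over time.
```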
- **Streaming Application Using Spark Structured Streaming.ipynb**
  - Purpose: Building a streaming application for real-time data processing.
  - Key Components:
    - Setting up Spark Structured Streaming.
    - Processing streaming data in real time.
  - Libraries used: `pyspark`.
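A minimal shape for such an application, sketched under the assumption that records arrive on a Kafka topic named `transactions` with a JSON payload; the function only builds the streaming query and never starts it (running it requires `pyspark` plus the Spark–Kafka connector package):

```python
def build_query(bootstrap_servers="localhost:9092", topic="transactions"):
    """Build (but do not start) a streaming query that flags high-value transactions."""
    from pyspark.sql import SparkSession  # deferred import: needs pyspark installed
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import DoubleType, StringType, StructField, StructType

    # Assumed message schema -- adjust to the actual producer's payload.
    schema = StructType([
        StructField("transaction_id", StringType()),
        StructField("amount", DoubleType()),
    ])

    spark = SparkSession.builder.appName("fraud-stream").getOrCreate()
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", bootstrap_servers)
           .option("subscribe", topic)
           .load())

    parsed = (raw
              .select(from_json(col("value").cast("string"), schema).alias("t"))
              .select("t.*"))
    # Illustrative rule: flag anything above an assumed 500 threshold.
    return parsed.filter(col("amount") > 500)

# A caller would then start it, e.g.:
# build_query().writeStream.format("console").start().awaitTermination()
```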
## Requirements

- Python 3.x
- Jupyter Notebook or Google Colab
- Required Python libraries: `pandas`, `numpy`, `matplotlib`, `seaborn`, `scikit-learn`, `faker`, `kafka-python`, `pyspark`
## Getting Started

1. Clone the repository: `git clone <repository-url>`
2. Navigate to the project directory: `cd <repository-folder>`
3. Install the required libraries: `pip install -r requirements.txt`
4. Open the notebooks in Jupyter or any compatible environment (e.g., Google Colab).
5. Follow the instructions within each notebook to execute the cells in sequence.
## Datasets

The datasets used in this project are too large to include in the repository. Please email me at [your-email@example.com] to request access to the datasets.
## License

This project is licensed under the GNU General Public License. See the LICENSE file for details.
## Acknowledgments

- Python documentation
- Open-source libraries used in the project
- Kafka and Spark community resources