Commit 1f919f87 authored by Totaro Massimo G

readme update

parent 98a05c21
# SSN
In this repository, you can find the scripts needed to generate a Sequence Similarity Network starting from a curated sequence alignment, as described in [this chapter](https://doi.org/10.5281/zenodo.15228784).
Our recommendation is to use the [Google Colaboratory version](#colab-notebook) below, since it automates the setup and execution.
Otherwise, follow these local setup instructions.
## Initial Setup
Clone this repository either via git CLI `git clone https://gitlab.tugraz.at/D5B8E35025578B91/ssn.git` or by manual download.
Clone this repository either via git CLI `git clone https://gitlab.tugraz.at/bioc/ssn.git` or by manual download.
Check the input file format and then follow the instructions for the chosen analysis method.
## Input File
### Preliminary steps
As described in the [chapter](https://www.youtube.com/watch?v=dQw4w9WgXcQ), the BLAST database can be built with:
The BLAST database can be built with:
```bash
makeblastdb -in FILE.fasta -dbtype prot -title TITLE -parse_seqids -out DATABASE
```
@@ -20,12 +25,11 @@ The all-vs-all BLAST is performed with:
```bash
blastp -db DATABASE -query FILE.fasta -out FILE.tsv -outfmt "6 qseqid sseqid evalue bitscore"
```
FILE.tsv is the output file used in the subsequent analysis to geenrate the SSN.
FILE.tsv is the output file used in the subsequent analysis to generate the SSN.
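Before running an analysis script, it can help to confirm that FILE.tsv really has the four tab-separated columns requested by the `-outfmt` string above. The following is a minimal sketch using only the Python standard library; the function name `read_blast_table` and the sample sequence IDs are hypothetical and not part of this repository.

```python
import csv
import io

# Column order mirrors the -outfmt string used above; BLAST writes no header row.
COLUMNS = ("qseqid", "sseqid", "evalue", "bitscore")


def read_blast_table(handle):
    """Parse 4-column BLAST tabular output into a list of dicts.

    Hypothetical helper: raises ValueError on any non-empty line
    that does not have exactly four tab-separated fields.
    """
    rows = []
    for line_number, fields in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(COLUMNS)} fields, got {len(fields)}"
            )
        row = dict(zip(COLUMNS, fields))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        rows.append(row)
    return rows


# Made-up sample standing in for FILE.tsv.
sample = io.StringIO(
    "seqA\tseqA\t0.0\t250.0\n"
    "seqA\tseqB\t1e-30\t110.0\n"
    "seqB\tseqA\t2e-30\t108.0\n"
)
table = read_blast_table(sample)
print(len(table), "hits parsed")
```

For a real file, pass an open handle instead of the sample, for example `with open("FILE.tsv", newline="") as handle: table = read_blast_table(handle)`.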
## Analysis Scripts
### AWK
The AWK script is meant to be operated from a GNU/Linux system shell.
@@ -36,7 +40,6 @@ It must be run as:
```
where FILE.tsv is the input file, formatted as indicated in the **Input File** section, and BITSCORES.csv and EVALUES.csv are the files containing the respective scores.
### Python
The Python script requires a minimal Python data analysis setup with the pandas library, which can be installed via `pip install pandas` in your working environment.
@@ -48,7 +51,8 @@ The analysis can then be launched from any shell as:
where FILE.tsv is the input file, formatted as indicated in the **Input File** section, and BITSCORES.csv and EVALUES.csv are the files containing the respective scores.
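To illustrate the idea of separating the two scores, here is a rough sketch of how the four-column BLAST table could be split into one bit-score table and one E-value table. It is not the repository's script: it uses the standard library instead of pandas, and the output layout (query, subject, score per row) and the helper names are assumptions made for illustration only.

```python
import csv
import io


def split_scores(rows):
    """Split 4-column BLAST rows into (bitscore_rows, evalue_rows).

    Each input row is (qseqid, sseqid, evalue, bitscore), as in FILE.tsv.
    """
    bitscores, evalues = [], []
    for qseqid, sseqid, evalue, bitscore in rows:
        bitscores.append((qseqid, sseqid, float(bitscore)))
        evalues.append((qseqid, sseqid, float(evalue)))
    return bitscores, evalues


def to_csv_text(rows):
    """Serialise rows as CSV text (assumed layout: query, subject, score)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up rows standing in for FILE.tsv.
blast_text = "seqA\tseqB\t1e-30\t110.0\nseqB\tseqA\t2e-30\t108.0\n"
rows = list(csv.reader(io.StringIO(blast_text), delimiter="\t"))

bitscores, evalues = split_scores(rows)
bitscores_csv = to_csv_text(bitscores)  # stand-in for BITSCORES.csv
evalues_csv = to_csv_text(evalues)      # stand-in for EVALUES.csv
print(bitscores_csv)
```

The actual files written by the repository's script may be structured differently, so treat this only as a picture of the data flow.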
### Colab Notebook
## Colab Notebook
The Colab notebook version requires no local setup and can be run from any browser (a ready-to-use version requiring no login can be found [here](https://colab.research.google.com/drive/1RQssmD8X7ZOGaxOYDUYA5kpxmDIGYmkx)).
Upon visiting the [google colab website](https://colab.research.google.com/), log in, upload the IPYNB file and run it, following the instructions.
If that version does not work, download the IPYNB file from this repository, log into the [Google Colab website](https://colab.research.google.com/), upload the file, and follow the instructions.