diff --git a/README.md b/README.md
index 890678483a4812d31e37f1ac1feab0ebd6cef65b..60a30b30560843b9da92a173f14da1022c1af911 100644
--- a/README.md
+++ b/README.md
@@ -1,18 +1,23 @@
 # SSN
 
+In this repository, you can find the scripts needed to generate a Sequence Similarity Network starting from a curated sequence alignment, as described in [this chapter](https://doi.org/10.5281/zenodo.15228784).
+
+Our recommendation is to use the [Google Colaboratory version](#colab-notebook) below, since it automates the setup and execution.
+
+Otherwise, follow these local setup instructions.
+
 ## Initial Setup
 
-Clone this repository either via git CLI `git clone https://gitlab.tugraz.at/D5B8E35025578B91/ssn.git` or by manual download.
+Clone this repository either via git CLI `git clone https://gitlab.tugraz.at/bioc/ssn.git` or by manual download.
 
 Check the input file format and then follow the instructions for the chosen analysis method.
 
 ## Input File
 
-
 ### Preliminary steps
 
-As described in the [chapter](https://www.youtube.com/watch?v=dQw4w9WgXcQ), the BLAST database can be built with:
+The BLAST database can be built with:
 
 ```bash
 makeblastdb -in FILE.fasta -dbtype prot -title TITLE -parse_seqids -out DATABASE
 ```
@@ -20,12 +25,11 @@ The all-vs-all BLAST is performed with:
 ```bash
 blastp -db DATABASE -query FILE.fasta -out FILE.tsv -outfmt "6 qseqid sseqid evalue bitscore"
 ```
 
-FILE.tsv is the output file used in the subsequent analysis to geenrate the SSN.
+FILE.tsv is the output file used in the subsequent analysis to generate the SSN.
 
 ## Analysis Scripts
 
-
 ### AWK
 
 The AWK script is meant to be operated from a GNU/Linux system shell.
@@ -36,7 +40,6 @@ It must be run as:
 ```
 where the FILE.tsv is the input file, formatted as indicated in the **Input File** section, and the BITSCORES.csv and EVALUES.csv files containing the respective scores.
 
-
 ### Python
 
 The python script requires a minimal python data analysis setup, with the pandas library to be installed via `pip install pandas -y` in your working environment.
@@ -48,7 +51,8 @@ The analysis can then be launched from any shell as:
 
 where the FILE.tsv is the input file, formatted as indicated in the **Input File** section, and the BITSCORES.csv and EVALUES.csv files containing the respective scores.
 
-### Colab Notebook
+## Colab Notebook
 
 The colab notebook version requires no local setup and can be run from any browser (an already set up version requiring no login can be found [here](https://colab.research.google.com/drive/1RQssmD8X7ZOGaxOYDUYA5kpxmDIGYmkx)).
-Upon visiting the [google colab website](https://colab.research.google.com/), log in, upload the IPYNB file and run it, following the instructions.
+
+If that version does not work, download the IPYNB file from this repository, log into the [google colab website](https://colab.research.google.com/), upload the file and follow the instructions.