Merged
3 changes: 3 additions & 0 deletions README.md
@@ -55,6 +55,8 @@ This repo is meant to be used for sharing, in a version controlled way, both hea
- Per language requirements:
- Bash: calling the function without required parameters, or with -h or --help, should yield documentation
- Example reference: [count_cores](c4_utils/count_cores)
- Perl: calling the function without required parameters, or with -h or --help, should yield documentation
- Example reference: [metis_client_cmd_builder.pl](dl_utils/metis_client_cmd_builder.pl)
- R: documentation in `roxygen` syntax is heavily recommended, as is publishing via `document::document()` to create a .txt file rendering (because the language does not retain documentation alongside `source()`-acquired functions).
- Example reference: [essential_scripts/single_cell/density_plotter.R](essential_scripts/single_cell/density_plotter.R)
- Python: documentation in `sphinx` syntax is heavily recommended.
@@ -100,6 +102,7 @@ Note: All files deemed share-worthy can be referenced in this table, but only fi
| c4_utils/count_cores | a command line executable that allows a user to 1) self-monitor their active cores on `krummellab` nodes (default) or 2) use optional flags to query all DSCoLab active jobs to test for core monopoly | Rebecca | main |
| c4_utils/seff | a command line util that will collect time, core, and memory usage stats for a given job; dependency for `core_count` | Rebecca | main |
| bash_utils/ts_log.sh, python_utils/ts_log.sh, r_utils/ts_log.R | Scripts for bash, python, and R that yield a function for generating timestamped log messages, named `ts_log`, when sourced. | Dan | main |
| dl_utils/metis_client_cmd_builder.pl | A command line utility for generating scripts that can be used with metis_client based on return of an initial metis_client `find` command. One use case might be restricting all VCF files of a project. | Ravi | main |
| single_cell/density_plotter.R | When `source()`'d, defines an R function that plots density of clusters across the umap space | Dan | main |
| single_cell/annotation_import.R | When `source()`'d, defines an R function for pulling annotations into Seurat or SCE objects from a csv. An [example 'annots_file'](single_cell/annotation_import_example.csv) and a [txt version of the function documentation](single_cell/annotation_import.txt) are also included. | Dan | main |
| single_cell/add_module_score_from_excel_gene_sets.R | R code for reading gene sets from an excel file, running Seurat::AddModuleScore, and visualizing the results | Dan | db/sc_module_score |
137 changes: 137 additions & 0 deletions dl_utils/metis_client_cmd_builder.pl
@@ -0,0 +1,137 @@
#! /usr/bin/perl

use strict;
use warnings;
use POSIX qw(strftime);

sub print_help {
print <<'END_HELP';
Usage: perl metis_client_cmd_builder.pl [OPTIONS]

Options:
--client-path=PATH Specify the path to the metis_client
--project=NAME Name of the project to work on. E.g., "xhlt2"
--file-ext=EXT File extension(s) to use for matching files. Multiple file extensions can
be separated by spaces (must be enclosed in quotes). E.g., ".fastq.gz .fq.gz"
--cmd=COMMAND Define the metis_client command to execute for each matching file
--cmd-suffix=SUFFIX Suffix to append to the command. E.g., "." to append "." to the "put" command.
The same value is applied to every command.
--help Show this help message and exit

Example:
perl path/to/metis_client_cmd_builder.pl --client-path=/c4/home/$USER/metis/bin/metis_client --cmd=restrict --file-ext=".vcf.gz .vcf.gz.tbi" --project=demo
perl path/to/metis_client_cmd_builder.pl --client-path=/c4/home/$USER/metis/bin/metis_client --cmd=get --file-ext=".fastq.gz .fq.gz" --project=demo --cmd-suffix='.'
END_HELP
exit;
}

sub parse_args {
my @args = @_;
my %options;

# Define allowed and required options
my %allowed = map { $_ => 1 } qw(client-path cmd cmd-suffix file-ext project);
my %required = map { $_ => 1 } qw(client-path cmd file-ext project);

foreach my $arg (@args) {
if ($arg eq '--help' or $arg eq '-h') {
print_help();
}
elsif ($arg =~ /^--(\w[\w-]*)=(.*)$/) {
my ($key, $value) = ($1, $2);
if ($allowed{$key}) {
$options{$key} = $value;
} else {
warn "Unknown option ignored: --$key\n";
}
} else {
warn "Invalid argument format (expected --key=value): $arg\n";
}
}

# Check for missing required options (sorted so the error message is stable between runs)
my @missing = sort grep { !exists $options{$_} } keys %required;
if (@missing) {
die "Missing required option(s): " . join(', ', @missing) . "\nUse --help to see usage.\n";
}

return %options;
}


sub random_string {
my $length = shift || 10;
my @chars = ('A'..'Z', 'a'..'z', 0..9);
my $str = '';
$str .= $chars[int rand @chars] for 1..$length;
return $str;
}

sub make_timestamp {
return strftime("%Y-%m-%d_%H.%M.%S", localtime);
}

my $timestamp = make_timestamp();

#my $rand_str = random_string(10);

my %options=parse_args(@ARGV);
my $p = $options{'project'};
my $client_path = $options{'client-path'};
my $cmd = $options{'cmd'};
my $cmd_suffix = "";
$cmd_suffix = $options{'cmd-suffix'} if(exists $options{'cmd-suffix'});
my $file_ext = $options{'file-ext'};
my @file_exts = split(" ", $file_ext);

print "###############################################\n";
print "Input options:\n";
for my $i (keys %options) {
print " $i=$options{$i}\n";
}
my $metis_client_cmds_file = "$p-$timestamp.cmds";
print "Output:\n";
print " metis_client command file=$metis_client_cmds_file\n";
print "###############################################\n";

# Check that the Janus token is set and that the client, project, and command are valid
print "Checking Janus token and input parameters..... ";
my @token_check = `$client_path metis://$p $cmd 2>&1`;
# Guard against empty output so the match below does not warn on undef
my $first_line = defined $token_check[0] ? $token_check[0] : '';
if($first_line =~ /No environment variable TOKEN is set|No project is selected|Your token is expired|Invalid command|No such file or directory/) {
print "\nError:\n";
print @token_check;
exit 1;
}
print "Active Janus token is found and input parameters are valid..... Done.\n";

print "Finding files that match the input file extension(s)..... ";
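open(my $out_fh, '>', $metis_client_cmds_file) or die "\nCannot open $metis_client_cmds_file for writing: $!\n";
my @files = ();
for my $f_ext (@file_exts) {
# Read the output of the metis_client `find` command for this extension
open(my $find_fh, '-|', "$client_path metis://$p/ find name=~%$f_ext") or die "\nCannot run $client_path: $!\n";
push @files, <$find_fh>;
close($find_fh);
}
# Count only file lines; directory lines (containing ':') and "No results" lines are excluded
my $n_files = grep { !/:/ && !/No results/ } @files;
print "Found $n_files files matching '$file_ext'..... Done.\n";

print "Building metis_client commands..... ";
my $dir = "";
for my $f (@files) {
chomp $f;
next if($f =~ /No results/);
# A line containing ':' names the directory for the file lines that follow it
if($f =~ /:/){
$f=~s/://g;
$dir = $f;
}
else {
my $file="/$dir/$f";
print $out_fh "$cmd '$file' $cmd_suffix\n";
}
}
close($out_fh);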
print "Done.\n";
print "-----------------------------------------------\nRun the following command to execute the generated metis_client commands:\n";
print " $client_path metis://$p < $p-$timestamp.cmds > $p-$timestamp.log 2> $p-$timestamp.err\n";
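
The command-building loop at the end of `metis_client_cmd_builder.pl` can be sketched in isolation as follows. This is a minimal sketch, not the PR's code. `build_cmds` is a hypothetical helper, and the sample input is made up. It assumes, as the script appears to, that a directory line in the `find` output ends with `:` and is followed by the names of the files in that directory. The real `metis_client` output format is not confirmed here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the PR): turn `find`-style output lines
# into one command line per file.
# Assumption: a line ending in ':' names a directory, and the lines that
# follow it are file names inside that directory.
sub build_cmds {
    my ($cmd, $cmd_suffix, @lines) = @_;
    my $dir  = '';
    my @cmds = ();
    for my $line (@lines) {
        chomp(my $l = $line);
        next if $l eq '' or $l =~ /No results/;
        if ($l =~ /^(.*):$/) {
            # Directory header: strip only the trailing colon
            $dir = $1;
            next;
        }
        my $cmd_line = "$cmd '/$dir/$l'";
        # Append the suffix only when one was given
        $cmd_line .= " $cmd_suffix" if length $cmd_suffix;
        push @cmds, $cmd_line;
    }
    return @cmds;
}

# Made-up sample input, for illustration only
my @sample = ("data/vcf:\n", "a.vcf.gz\n", "b.vcf.gz\n", "No results\n");
print "$_\n" for build_cmds('restrict', '', @sample);
```

With the sample input above, the sketch prints `restrict '/data/vcf/a.vcf.gz'` and `restrict '/data/vcf/b.vcf.gz'`. Unlike the script, it matches the colon only at the end of a line, so a file name that happens to contain `:` is not mistaken for a directory. Whether that stricter match suits the real `find` output is an open assumption.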