Rstudio-server deployment

After installing the two servers (rstudio-server and shiny-server) with superuser privileges, we cannot run them directly because the path to R has not been set.

The rstudio-server configuration file is /etc/rstudio/rserver.conf.
Append the following line to it:

rsession-which-r=/home/csj/anaconda3/envs/r411py37/bin/R

More options, such as the port and external library paths, can also be set in rserver.conf (see https://support.rstudio.com/hc/en-us/articles/200552316-Configuring-RStudio-Workbench-RStudio-Server for details).

Run the following command to verify that rstudio-server is configured correctly:

sudo rstudio-server verify-installation

We can manage rstudio-server with the following commands:
sudo rstudio-server status
sudo rstudio-server start
sudo rstudio-server stop
sudo rstudio-server restart

The default port of rstudio-server is 8787.
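
After starting the service, it can be handy to confirm that something is actually listening on the port. The following is a minimal sketch, not part of the rstudio-server tooling: the helper name port_is_open is made up for illustration, and it assumes the server runs on the same machine (127.0.0.1) with the default port 8787.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Assumption: rstudio-server runs locally on its default port.
    print("rstudio-server reachable on 8787:", port_is_open("127.0.0.1", 8787))
```

A False result only says the TCP connection failed; it does not tell whether the cause is the service, the configured port, or a firewall.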

 

Shiny-server deployment

Run the following command before starting the server:
export R=/home/csj/anaconda3/envs/r411py37/bin

The path to R is configured in /etc/shiny-server/shiny-server.conf.

The server settings can be changed in the same file:

# Define a server that listens on port 3838
server {
  listen 3838;

  # Define a location at the base URL
  location / {

    # Host the directory of Shiny Apps stored in this directory
    site_dir /srv/shiny-server;

    # Log all Shiny output to files in this directory
    log_dir /var/log/shiny-server;

    # When a user visits the base URL rather than a particular application,
    # an index of the applications available in this directory will be shown.
    directory_index on;
  }
}

Start the server with:
sudo systemctl start shiny-server

https://nymrli.top/2019/02/24/%E6%90%AD%E5%BB%BAfrp%E6%9C%8D%E5%8A%A1-%E9%98%BF%E9%87%8C%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8/

vi frps.ini

[common]
bind_port = 7000
vhost_http_port = 8080

The [common] section is required. bind_port is the port you choose for the frp server, and vhost_http_port is the port you choose for HTTP access.

./frps -c ./frps.ini

——————————————————————————–
vi frpc.ini

[common]
server_addr = x.x.x.x
server_port = 7000

[ssh]
type = tcp
local_ip = 127.0.0.1
local_port = 22
remote_port = 6000

[nas]
type = http
local_port = 5000
custom_domains = no1.sunnyrx.com

[web]
type = http
local_port = 80
custom_domains = no2.sunnyrx.com

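The configuration above corresponds to the server-side configuration.

In [common], set server_addr to the IP of the frp server (that is, the IP of the public-facing host) and server_port to the frp server's bind_port.

In [ssh], set local_port to the Synology's SSH port.

In [nas], type corresponds to the server-side configuration. Set local_port to the Synology's DSM port. custom_domains is the domain to map; remember that the domain's A record must resolve to the IP of the public-facing host.

[web] works the same way; set local_port to the Synology's web port. Two HTTP reverse proxies are created here to map the Synology's two important ports, 5000 and 80: the former is used to log in to Synology management, the latter for the Synology's Web Station and DS Photo.

Save the configuration and run the frp client with the following command. (Likewise, if you need to run it in the background, scroll down to the section on running in the background.)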

./frpc -c ./frpc.ini
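
Since the client file has to match the server file, a quick consistency check can catch typos before starting either side. The sketch below is only an illustration, not part of frp: the function name check_frp_configs is hypothetical, it only handles the INI layout shown above, and it assumes that every proxy of type http needs vhost_http_port to be set on the server.

```python
import configparser

FRPS_INI = """
[common]
bind_port = 7000
vhost_http_port = 8080
"""

FRPC_INI = """
[common]
server_addr = x.x.x.x
server_port = 7000

[ssh]
type = tcp
local_ip = 127.0.0.1
local_port = 22
remote_port = 6000

[nas]
type = http
local_port = 5000
custom_domains = no1.sunnyrx.com

[web]
type = http
local_port = 80
custom_domains = no2.sunnyrx.com
"""


def check_frp_configs(frps_text, frpc_text):
    """Return a list of problems; an empty list means the two files agree."""
    server = configparser.ConfigParser()
    server.read_string(frps_text)
    client = configparser.ConfigParser()
    client.read_string(frpc_text)

    problems = []
    # The client's server_port must equal the server's bind_port.
    if client["common"]["server_port"] != server["common"]["bind_port"]:
        problems.append("server_port in frpc.ini differs from bind_port in frps.ini")
    # Assumption: http proxies rely on vhost_http_port being defined on the server.
    for name in client.sections():
        if name == "common":
            continue
        if client[name].get("type") == "http" and "vhost_http_port" not in server["common"]:
            problems.append(f"[{name}] is an http proxy but frps.ini has no vhost_http_port")
    return problems


if __name__ == "__main__":
    for problem in check_frp_configs(FRPS_INI, FRPC_INI):
        print(problem)
```

In practice the two strings would be read from frps.ini and frpc.ini; they are inlined here so the example is self-contained.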

Vector databases
Milvus https://github.com/milvus-io/milvus
Zhihu introduction to the Milvus vector database https://zhuanlan.zhihu.com/p/393699963
Zhihu introduction to vector databases https://zhuanlan.zhihu.com/p/40487710

Single-cell
Classification using only the information of a single cell: CellID https://zhuanlan.zhihu.com/p/392992024
Next Generation Genomics https://underline.io/events/165/reception
Clustered DotPlot https://divingintogeneticsandgenomics.rbind.io/post/clustered-dotplot-for-single-cell-rnaseq/
Clustered DotPlot https://davemcg.github.io/post/lets-plot-scrna-dotplots/
Enhanced single-cell plotting with Seurat support https://bioconductor.org/packages/release/bioc/vignettes/dittoSeq/inst/doc/dittoSeq.html#561_dittoDotPlot
A good-looking DotPlot https://www.biostars.org/p/484150/
HieRFIT: A hierarchical cell type classification tool https://academic.oup.com/bioinformatics/advance-article-abstract/doi/10.1093/bioinformatics/btab499/6320801?redirectedFrom=fulltext
Subpopulation mapping https://www.researchgate.net/publication/329421271_Accurate_sub-population_detection_and_mapping_across_single_cell_experiments_with_PopCorn
Iterative clustering https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8009055/
Challenges in single-cell clustering https://www.nature.com/articles/s41576-018-0088-9#Sec2

Generative Models
GAN vs VAE, the essential difference (Zhihu) https://www.zhihu.com/question/317623081/answer/1997177136


https://academic.oup.com/bioinformatics/article/26/22/2897/227791

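Given a set of observed statistics, this note describes the algorithm FastPval uses to assign a P-value to each observation. FastPval computes P-values in two stages and works on the right tail of the distribution of the statistics.

In the first stage, we randomly sample a subset N from the original dataset O (for efficiency, N is usually less than one-hundredth the size of O). We sort N and find a cutoff s_c such that the scores above s_c form the top P portion of N. Both N and P are set by the user; by default N = 100,000 and P = 0.001.

With this cutoff, we scan the dataset O again and put every score greater than s_c into a set M. We sort M as well and record its maximum, s_m. The sorted N and M are saved as the two models, M1 and M2.

In the second stage, to compute the P-value of a new statistic s, we first compare it with s_c: if s ≤ s_c, we look up its P-value in M1; otherwise we look it up in M2. If s > s_m, then s lies outside the range of our sampled scores, and we either compute its P-value from a theoretical distribution or simply set it to 0, depending on user preference. When a theoretical distribution (normal or extreme value) is used, its parameters are estimated from the subset N.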

 

For simplicity, here we illustrate our method in a two-stage approach and use the right tail of the distribution to calculate the statistics. In the first stage, we randomly sample a subset N from the original large dataset O. N is usually less than one-hundredth of the size of O, thus saving processing time. We sort N and obtain a cutoff score Sc representing the top P portion of N. Both N and P are parameters specified by the users, and are set to N = 100 000 and P = 0.001 by the default. We then scan the original set and put scores greater than Sc into our second subset M, and we obtain the maximum score Sm in M. The two subsets N and M are sorted, saved, and serve as our two models (M1 and M2). To calculate the P-value for a new score S, we compare S with Sc. If S ≤ Sc, we will find its P-value in M1. Otherwise we use M2. If S > Sm, indicating S is out of our resampling score range, we use theoretical distribution to calculate its P-value or simply set the P-value to 0, at the user’s preference. The parameters of two theoretical distributions, normal and extreme value distributions, were obtained from dataset N.
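
To make the two stages concrete, here is a minimal Python sketch of the procedure as I read it. It is not the FastPval implementation: the function names build_models and p_value are made up, the empirical P-value is assumed to be the fraction of scores greater than or equal to s, counts in M2 are assumed to be divided by the size of O (M holds every score in O above the cutoff), and only the "set the P-value to 0" option is implemented for scores beyond s_m.

```python
import bisect
import random


def build_models(scores, n=100_000, p=0.001, seed=0):
    """Stage 1: build the sorted subset N (model M1) and the sorted right tail M (model M2)."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    rng = random.Random(seed)
    n = min(n, len(scores))
    m1 = sorted(rng.sample(scores, n))           # random subset N, sorted
    k = min(max(1, round(n * p)), n - 1)         # how many subset scores form the top-p portion
    s_c = m1[n - k - 1]                          # cutoff: scores above s_c are the top-p portion of N
    m2 = sorted(s for s in scores if s > s_c)    # every score in O above the cutoff
    s_m = m2[-1] if m2 else s_c                  # largest score seen in O
    return {"m1": m1, "m2": m2, "s_c": s_c, "s_m": s_m, "total": len(scores)}


def p_value(model, s):
    """Stage 2: empirical right-tail P-value of a new score s."""
    if s <= model["s_c"]:
        m1 = model["m1"]
        # Fraction of the random subset that is >= s.
        return (len(m1) - bisect.bisect_left(m1, s)) / len(m1)
    if s > model["s_m"]:
        # Outside the sampled range: the sketch simply returns 0
        # (the theoretical-distribution option is left out).
        return 0.0
    m2 = model["m2"]
    # M2 contains all scores of O above s_c, so this count is exact over O.
    return (len(m2) - bisect.bisect_left(m2, s)) / model["total"]


if __name__ == "__main__":
    rng = random.Random(42)
    background = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
    model = build_models(background, n=2_000, p=0.01)
    for score in (0.0, 2.0, 4.0):
        print(score, p_value(model, score))
```

Because both models are kept sorted, each lookup is a binary search, which is where the speed-up over rescanning O for every new score comes from.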