Sijie’s Blog

Rstudio-server and shiny-server deployment

20 December 2021 in Uncategorized | No comments

Rstudio-server deployment

After installing the two servers with superuser privileges, we cannot run them directly because the path to R has not been set.

The configuration file of rstudio-server is located at /etc/rstudio/rserver.conf.
Append the following line to it:

rsession-which-r=/home/csj/anaconda3/envs/r411py37/bin/R

More options, such as the port and external library paths, can also be set in rserver.conf (see details at https://support.rstudio.com/hc/en-us/articles/200552316-Configuring-RStudio-Workbench-RStudio-Server).

Run the following command to verify that rstudio-server is configured correctly:

sudo rstudio-server verify-installation

We can manage rstudio-server with the following commands:
sudo rstudio-server status
sudo rstudio-server start
sudo rstudio-server stop
sudo rstudio-server restart

The default port of rstudio-server is 8787.

Shiny-server deployment

Run the following command before starting the server:
export R=/home/csj/anaconda3/envs/r411py37/bin

The path to R should be configured in /etc/shiny-server/shiny-server.conf.

We can change the configuration there:
# Define a server that listens on port 3838
server {
  listen 3838;

  # Define a location at the base URL
  location / {

    # Host the directory of Shiny Apps stored in this directory
    site_dir /srv/shiny-server;

    # Log all Shiny output to files in this directory
    log_dir /var/log/shiny-server;

    # When a user visits the base URL rather than a particular application,
    # an index of the applications available in this directory will be shown.
    directory_index on;
  }
}

Start the server with:
sudo systemctl start shiny-server
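
Once both services are started, a quick way to confirm that they are listening is to try a TCP connection to their ports (8787 for rstudio-server and 3838 for shiny-server, as given above). The following is a minimal sketch; the helper name port_is_open and the host 127.0.0.1 are assumptions for illustration, and a successful connection only shows that something accepts connections on that port, not that the service is configured correctly.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the text above; the host is an assumption (run on the server itself).
    for name, port in [("rstudio-server", 8787), ("shiny-server", 3838)]:
        state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```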

NAT traversal with frp

6 November 2021 in Uncategorized | No comments

https://nymrli.top/2019/02/24/%E6%90%AD%E5%BB%BAfrp%E6%9C%8D%E5%8A%A1-%E9%98%BF%E9%87%8C%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8/

vi frps.ini

[common]
bind_port = 7000
vhost_http_port = 8080

The [common] section is required. bind_port is the port you choose for the frp server, and vhost_http_port is the port you choose for HTTP access.

./frps -c ./frps.ini

——————————————————————————–
vi frpc.ini

[common]
server_addr = x.x.x.x
server_port = 7000

[ssh]
type = tcp
local_ip = 127.0.0.1
local_port = 22
remote_port = 6000

[nas]
type = http
local_port = 5000
custom_domains = no1.sunnyrx.com

[web]
type = http
local_port = 80
custom_domains = no2.sunnyrx.com

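The configuration above corresponds to the server-side configuration.

In [common], set server_addr to the IP of the frp server (that is, the IP of the public host) and server_port to the frp server's bind_port.

In [ssh], set local_port to the Synology SSH port.

In [nas], type corresponds to the server configuration. Set local_port to the Synology DSM port. custom_domains is the domain name to map; remember that the domain's A record must resolve to the IP of the public host.

[web] works the same way; set local_port to the Synology web port. Two HTTP reverse proxies are created here to map the two important Synology ports, 5000 and 80: the former is used to log in to the Synology management interface, the latter for Synology's Web Station and DS Photo.

Save the configuration and run the frp client with the following command. (As before, if you need to run it in the background, scroll down to the part about running in the background.)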

./frpc -c ./frpc.ini
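
Because frpc.ini is a plain INI file, it can be sanity-checked before starting the client. The sketch below uses Python's standard configparser to list each proxy section with its type, local port, and how it is exposed; the function name summarize_frpc and the "exposed_as" field are hypothetical, chosen only for this illustration, and the check does not validate the file against frp itself.

```python
import configparser


def summarize_frpc(text):
    """List the proxy sections of an frpc.ini given as a string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    rows = []
    for name in parser.sections():
        if name == "common":
            continue  # [common] holds the server address, not a proxy
        section = parser[name]
        rows.append({
            "name": name,
            "type": section.get("type"),
            "local_port": section.getint("local_port"),
            # http proxies are reached by domain, tcp proxies by remote_port
            "exposed_as": section.get("custom_domains") or section.get("remote_port"),
        })
    return rows


if __name__ == "__main__":
    with open("frpc.ini", encoding="utf-8") as handle:
        for row in summarize_frpc(handle.read()):
            print(row)
```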

Bulletin

29 July 2021 in Uncategorized | No comments

Vector databases
Milvus https://github.com/milvus-io/milvus
Zhihu introduction to the Milvus vector database https://zhuanlan.zhihu.com/p/393699963
Zhihu introduction to vector databases https://zhuanlan.zhihu.com/p/40487710

Single-cell
CellID: classification using only the information of individual cells https://zhuanlan.zhihu.com/p/392992024
Next Generation Genomics https://underline.io/events/165/reception
Clustered DotPlot https://divingintogeneticsandgenomics.rbind.io/post/clustered-dotplot-for-single-cell-rnaseq/
Clustered DotPlot https://davemcg.github.io/post/lets-plot-scrna-dotplots/
Enhanced single-cell plotting with Seurat support https://bioconductor.org/packages/release/bioc/vignettes/dittoSeq/inst/doc/dittoSeq.html#561_dittoDotPlot
A good-looking DotPlot https://www.biostars.org/p/484150/
HieRFIT: A hierarchical cell type classification tool https://academic.oup.com/bioinformatics/advance-article-abstract/doi/10.1093/bioinformatics/btab499/6320801?redirectedFrom=fulltext
Subpopulation mapping https://www.researchgate.net/publication/329421271_Accurate_sub-population_detection_and_mapping_across_single_cell_experiments_with_PopCorn
Iterative clustering https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8009055/
Challenges in single-cell clustering https://www.nature.com/articles/s41576-018-0088-9#Sec2

Generative Models
GAN vs VAE: the essential difference (Zhihu) https://www.zhihu.com/question/317623081/answer/1997177136

Password protected: Analysis Platform

19 January 2021 in Uncategorized | No comments

FastPval explained

16 December 2020 in Uncategorized | No comments

https://academic.oup.com/bioinformatics/article/26/22/2897/227791

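Given a set of observed values of a statistic, we describe the algorithm FastPval uses to assign a P-value to each observation. The P-value computation in FastPval has two stages, and it uses the right tail of the distribution of the statistic to calculate the statistics (?).

In the first stage, we randomly sample a subset N from the original dataset O (for efficiency, N is usually less than one-hundredth of the size of O). We sort N and find a cutoff $s_c$ such that the scores above $s_c$ form the top P portion of N (both N and P are set by the user; by default N = 100,000 and P = 0.001).

With this cutoff, we scan the dataset O again and put the values greater than $s_c$ into a set $M$, which is also sorted, and we obtain the maximum value $s_m$ in $M$. The sorted N and M are saved as the two models M1 and M2.

In the second stage, when a new statistic $s$ arrives, we first compare it with $s_c$ to compute its P-value: if $s \leq s_c$, we compute its P-value in M1; otherwise we compute it in M2. If $s > s_m$, then $s$ lies beyond the range we sampled, and we either use a theoretical distribution to compute its P-value or simply set the P-value to 0 (depending on the user's preference; if a normal or extreme value distribution is used as the theoretical distribution, its parameters are estimated from dataset N).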

For simplicity, here we illustrate our method in a two-stage approach and use the right tail of the distribution to calculate the statistics. In the first stage, we randomly sample a subset N from the original large dataset O. N is usually less than one-hundredth of the size of O, thus saving processing time. We sort N and obtain a cutoff score S_c representing the top P portion of N. Both N and P are parameters specified by the users, and are set to N = 100 000 and P = 0.001 by the default. We then scan the original set and put scores greater than S_c into our second subset M, and we obtain the maximum score S_m in M. The two subsets N and M are sorted, saved, and serve as our two models (M1 and M2). To calculate the P-value for a new score S, we compare S with S_c. If S ≤ S_c, we will find its P-value in M1. Otherwise we use M2. If S > S_m, indicating S is out of our resampling score range, we use theoretical distribution to calculate its P-value or simply set the P-value to 0, at the user’s preference. The parameters of two theoretical distributions, normal and extreme value distributions, were obtained from dataset N.
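
The two-stage lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FastPval implementation: the function names build_models and p_value are hypothetical, and the text does not say how a P-value is read off from M1 or M2, so the sketch assumes the empirical right-tail fraction (the share of sampled scores at or above s for M1, and the number of scores in M at or above s divided by the size of O for M2). For scores beyond $s_m$ it takes the simpler of the two options and returns 0 instead of fitting a theoretical distribution.

```python
import bisect
import random


def build_models(scores, n=100_000, p=0.001, seed=0):
    """Stage 1: build the two sorted models from the full list of scores O."""
    rng = random.Random(seed)
    n = min(n, len(scores))
    m1 = sorted(rng.sample(scores, n))         # subset N, sorted ascending
    k = max(1, round(n * p))                   # size of the top-P portion of N
    s_c = m1[-k]                               # cutoff score (assumed: k-th largest in N)
    m2 = sorted(s for s in scores if s > s_c)  # subset M: every score in O above the cutoff
    return {"m1": m1, "m2": m2, "s_c": s_c, "total": len(scores)}


def p_value(model, s):
    """Stage 2: look up the right-tail P-value of a new score s."""
    if s <= model["s_c"]:
        m1 = model["m1"]
        # Assumed estimate: fraction of sampled scores that are >= s
        return (len(m1) - bisect.bisect_left(m1, s)) / len(m1)
    m2 = model["m2"]
    if not m2 or s > m2[-1]:
        # s exceeds s_m: outside the observed range, so fall back to 0
        return 0.0
    # Assumed estimate: M holds the whole tail of O, so divide by the size of O
    return (len(m2) - bisect.bisect_left(m2, s)) / model["total"]


if __name__ == "__main__":
    rng = random.Random(1)
    background = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
    model = build_models(background, n=2_000, p=0.01)
    for score in (0.0, 2.0, 3.5, 10.0):
        print(score, p_value(model, score))
```

Keeping both models sorted means each lookup is a binary search, and only scores above the cutoff need the larger, exact tail stored in M2.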
