首页
登录 | 注册

下载百度的LRC歌词的Perl脚本

以前做歌词秀Xlyrics的时候,苦于没有那个LRC的歌词站点能够直接搜索歌词的,有的歌词库又太小,前段时间一个不小心发现百度已经支持LRC歌词的在线搜索了,甚是惊喜了一番。
闲暇之余,自己用Perl语言做了一个简单的搜索和下载脚本。此脚本已经支持查找和下载功能了。
系统要求:
当然首先需要装上Perl语言的,其次还需要Perl语言的LWP模块,因为不同系统安装方法不同,请自行解决。
具体使用:
xiaosuo@gentux perl $ ./baidulrc -h
 Usage: baidulrc [options] MusicName.
        -l              get the lyrics list.
        -p num          set the page to n, default 1.
        -n num          set the number to n.
        -d              download the special lyrics.
        -x              output the content in XML format.
        -h              show this help page.
其它说明:
  1. 因为百度的歌词搜索不仅基于歌名,还基于歌词内容,所以可能不符合要求的结果偏多,不过有的情况下,可能变成好事,比如你只知道其中的某句话但想不起来歌名了。
  2. 因为这个脚本可以按照XML的格式输出内容,并且歌词默认是输出到标准输出的,所以它很容易和其它的程序接合,我有时间会把它集成到我的XLyircs当中。
具体代码:
#!/usr/bin/perl
#
# Copyright (C) xiaosuo
# License GPL2 or above
#
use strict;
use warnings;
use URI;
use HTML::TreeBuilder;
require LWP::UserAgent;
use Getopt::Std;

sub get_musics
{
        my $tree = $_[0] || die "No tree";
        my @musics;

        foreach my $div (
                $tree->look_down('_tag', 'div',
                        sub{
                                return 0 unless $_[0]->attr('class');
                                return 0 unless($_[0]->attr('class') =~ /BlueBG/);
                        })){
                my ($title) = ($div->as_text =~ /歌曲:(.*)/);
                push @musics, $title;
        }

        return @musics;
}

sub get_actors
{
        my $tree = $_[0] || die "No tree";
        my @actors;

        foreach my $div (
                $tree->look_down('_tag', 'div',
                        sub{
                                return 0 unless $_[0]->attr('style');
                                return 0 unless $_[0]->attr('style') =~ /padding-top:10px;padding-left:15px/;
                        })){
                if($div->as_text =~ /歌手:(.*) 专辑:(.*)/){
                        push @actors, $1;
                }elsif($div->as_text =~ /歌手:(.*)/){
                        push @actors, $1;
                }else{
                        push @actors, $div->as_text;
                }
        }

        return @actors;
}

sub get_links
{
        my $tree = $_[0] || die "No tree";
        my @links;

        foreach my $div (
                $tree->look_down('_tag', 'div',
                        sub{
                                return 0 unless $_[0]->attr('class');
                                return 0 unless $_[0]->attr('class') =~ /unnamed3/;
                        })){
                if(my ($a) = ($div->look_down('_tag', 'a',
                                sub{
                                        return 0 unless $_[0]->as_text;
                                        return 0 unless $_[0]->as_text =~ /LRC歌词/;
                                }))){
                        push @links, $a->attr('href');
                }else{
                        push @links, '';
                }
        }

        return @links;
}

sub get_page_num
{
        my $tree = $_[0] || die "No tree";
        my $num = 1;

        if(my ($div) = (
                $tree->look_down('_tag', 'div',
                        sub{
                                return 0 unless $_[0]->attr('class');
                                return 0 unless $_[0]->attr('class') =~ /pg/;
                        }))){
                foreach my $a ($div->look_down('_tag', 'a',
                                sub{
                                        return 0 unless $_[0]->as_text;
                                })){
                        if($a->as_text =~ /\[(.*)\]/){
                                $num = int($1);
                        }
                }
        }

        return $num;
}

# Usage: baidu
sub usage
{
        print " Usage: baidulrc [options] MusicName.\n";
        print "         -l              get the lyrics list.\n";
        print "         -p num          set the page to n, default 1.\n";
        print "         -n num          set the number to n.\n";
        print "         -d              download the special lyrics.\n";
        print "         -x              output the content in XML format.\n";
        print "         -h              show this help page.\n";
}

# parse the options.
my %opts;
if(!getopts("lp:n:dxh", \%opts)){
        usage();
        exit(1);
}
if($opts{"h"}){
        usage();
        exit(0);
}
if($#ARGV != 0){
        usage();
        exit(1);
}
my $music_name = $ARGV[0];
my $page_number = 0;
my $music_number = -1;
if($opts{"p"} && $opts{"p"} > 1){
        $page_number = ($opts{"p"} - 1) * 10;
}
if($opts{"n"}){
        if($opts{"n"} < 1 || $opts{"n"} > 10){
                print "The lyrics number must between 0 and 9.\n";
                exit(1);
        }
        $music_number = $opts{"n"} - 1;
}

# get the lyrics list
my $uri = URI->new('http://mp3.baidu.com/m');
$uri->query_form(
#       'f' => 'ms',
        'rn' => '10',
        'tn' => 'baidump3lyric',
        'ct' => '150994944',
        'word' => $music_name,
        'lm' => '-1',
        'z' => '0',
        'cl' => '3',
        'sn' => '',
        'cm' => '1',
        'sc' => '1',
        'bu' => '',
        'pn' => "$page_number"
);
my $ua = LWP::UserAgent->new;
my $response = $ua->get($uri->as_string) || die("get $uri->as_string failed.\n");
my $tree = HTML::TreeBuilder->new;
if($response->is_success){
        $tree->parse_content($response->content) || die("parse failed.\n");
}else{
        die $response->status_line;
}

my @links = get_links $tree;
if($opts{"d"}){ # download the lyrics
        if($music_number < 0){
                print "You must give the music mumber.\n";
                usage();
                exit(1);
        }
        if($links[$music_number] eq ""){
                print "There is no LRC lyrics for this music, try the others.\n";
                exit(1);
        }
        my $fua = LWP::UserAgent->new;
        my $fres = $fua->get($links[$music_number]) || die("get $links[$music_number] failed.\n");
        if(!$fres->is_success){
                print "get $links[$music_number] failed.\n";
                exit(1);
        }
        if($opts{"x"}){
                print "\n";
                print "\n";
        }
        foreach my $line (split("\n", $fres->content)){
                if($line =~ /^\[.*:.*\].*$/){
                        print $line . "\n";
                }
        }
        if($opts{"x"}){
                print "
\n";
        }
        exit 0;
}

my @musics = get_musics $tree;
my @actors = get_actors $tree;
my $num = get_page_num $tree;
# output the list
if($opts{"x"}){
        print "\n";
        print "\n";
        for(my $i = 0; $i <= $#musics; $i ++){
                print "\n";
                print "$musics[$i]\n";
                print "$actors[$i]\n";
                print "$links[$i]\n";
                print "
\n";
        }
        print "$num\n";
        print "
\n";
}else{
        for(my $i = 0; $i <= $#musics; $i ++){
                print $i + 1 . "\t $musics[$i] - $actors[$i]\n"
        }
        print "Total page number: $num.\n";
}

# free the memory and exit
$tree->delete;
exit(0);


相关文章

  • 用 Net::SSH::Perl 实现不用建信任关系 ssh连接操作
        平时,要自动ssh到远程主机上做操作时,都要建立信任关系,这样在连接时,才不用输入密码. 就想不用建立信任关系的情况下,来自动ssh连接到远程主机做操作.可以用提供一个所要连接的远程主机列表来实现这个功能.列表里提供了主机IP,用户 ...
  • 网站漏洞修复之UEditor漏洞 任意文件上传漏洞 2018 .net新
    那么UEdito漏洞到底是如何产生的呢?最主要的还是利用了IIS的目录解压功能,在解压的同时会去访问控制器文件,包括controller.aspx文件,当上传到网站里的时候,会自动解压并调用一些特殊应用的目录地址,有些目录都可以被远程的调用 ...
  • 【原创】利用inotify+rsync实现linux文件批量更新
    原创作品,允许转载,转载时请务必以超链接形式标明文章 原始出处 .作者信息和本声明.否则将追究法律责任.http://blog.chinaunix.net/space.php?uid=9419692&do=blog&id=3 ...
  •     让新网站如何快速收录的方法教程较为简单,实施较为复杂.简单之处在于做好内容与外链(推荐友情链接),做好基础的url推送以及站内的优化设置即可.复杂在于几句简单的话语,在实施的时候困难不小,新站不易做友链,高质量内容也较难组织.很多站 ...
  • 拼写检查错误提示是搜索引擎都具备的一个功能,也就是说用户提交查询给搜索引擎,搜索引擎检查看是否用户输入的拼写有错误,对于中文用户来说一般造成的错误是输入法造成的错误.那么我们就来分析看看百度是怎么实现这一功能的. 我们分析拼写检查系统关注以 ...
  • 什么是公网IP、内网IP和NAT转换?
    1.引言 2.每台电脑都必须要一个公网IP吗? 我们都知道,IPv4中的IP地址的数量是有限的(所以现在都在搞IPv6嘛),每次把一部分地址分配出去,那么就意味着能够用来分配的IP地址就更少了,而且随着现在手机,电脑等的快速发展,如果每个手 ...

2020 unjeep.com webmaster#unjeep.com
12 q. 0.013 s.
京ICP备10005923号