Named Entity Recognition in Perl
[ Perl ]
by jaganadhg
@ 08.07.2009 17:40 GMT
Named Entity Recognition is another non-trivial problem in Natural Language Processing. For details about named entities, refer to the Wikipedia article on named entity recognition.
Usually a named entity refers to the name of a person, a company and so on. It may be a group of words, optionally joined by words such as "of", "the" and "and" (in English). Automatically identifying these groups is a hard problem.
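To make the "group of words" idea concrete, here is a minimal sketch of a naive heuristic in plain Perl. It is only an illustration and not how any real NER tool works: the function name naive_candidates and the sample sentence are made up for this example, and the rule is simply "a run of capitalized words, optionally linked by of/the/and".

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Naive heuristic (an illustration, not a real NER algorithm): a candidate
# is a run of capitalized words, optionally linked by "of", "the" or "and".
sub naive_candidates {
    my ($text) = @_;
    my $cap  = qr/[A-Z][a-z]+/;       # one capitalized word
    my $link = qr/(?:of|the|and)/;    # joining words allowed inside a name
    my @found;
    while ($text =~ /\b($cap(?:\s+(?:$link\s+)*$cap)*)/g) {
        push @found, $1;
    }
    return @found;
}

print "$_\n" for naive_candidates(
    "He joined the Bank of England after leaving Tata Consultancy Services."
);
```

On this sentence the sketch prints "He", "Bank of England" and "Tata Consultancy Services". The stray "He" shows why capitalization alone is not enough and why the problem needs a proper tool.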
Wikipedia lists a handful of tools for this purpose. I think NLTK (Natural Language Toolkit) also has a module for it. Here I am going to show how to do it with Perl.
There is a Perl module called Lingua::EN::NamedEntity, written by Simon Cozens, author of "Advanced Perl Programming". It is written for Named Entity Recognition.
To install the module on a GNU/Linux system, follow the steps given below.
Open a terminal. Type 'cpan' as the root user. At the cpan prompt, type 'install Lingua::EN::NamedEntity' and follow the instructions.
Here is the code to extract named entities with the module.
====code begin =========================
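#!/usr/bin/env perl
use strict;
use warnings;
use Lingua::EN::NamedEntity;

# Read all input (files named on the command line, or STDIN) into one string.
# Each line already ends with a newline, so join with the empty string.
my $str = join '', <>;

# extract_entities returns a list of hash references, one per entity found.
my @entities = extract_entities($str);

foreach my $entity (@entities) {
    print $entity->{entity}, "\n";
}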
===code end ============================
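The script above prints only the entity text. As far as I recall from the module's documentation, each returned hash also carries a 'class' key (person, place, organisation and so on); please treat that key as an assumption and check it against your installed version. The helper format_entity below is a made-up name for this sketch, and the module is loaded at run time so that a missing install gives a clear message instead of a compile error.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Format one entity record as "text<TAB>class". The 'class' key is an
# assumption taken from the module's documentation; fall back to 'unknown'.
sub format_entity {
    my ($entity) = @_;
    my $class = defined $entity->{class} ? $entity->{class} : 'unknown';
    return "$entity->{entity}\t$class";
}

# Load the module at run time so a missing install is reported cleanly.
if (eval { require Lingua::EN::NamedEntity; 1 }) {
    my $text = "Simon Cozens wrote Advanced Perl Programming.";
    print format_entity($_), "\n"
        for Lingua::EN::NamedEntity::extract_entities($text);
}
else {
    warn "Lingua::EN::NamedEntity is not installed\n";
}
```

Printing the class next to each entity makes it easier to judge how good the guesses are for your own text.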
I think the code is self-explanatory. (If not, please leave a comment.)
You may find the code slow at times. I don't know why!
My sincere thanks go to Simon Cozens and his book, and to my old PGDL students. I think I got the gyan from Simon's book and cpan.org.
Somebody really wants to see how it can be implemented for Indian languages.
Happy Hacking!!!!!!