Language detection


Hi there, on the old forum you answered a question on this topic with the code below. Where can I get the analyzer.php file? Or could you share the code of your demo version?
I have already tried this code, without success.

Thanks!

<code>
include("analyzer.php");

// Adjust this path to your local FreeLing installation
$FL_DIR = "/usr/local";

// launch an identifier
$ident = new analyzer("11111","--outf ident --fidn $FL_DIR/share/freeling/common/lang_ident/ident.dat","$FL_DIR/bin","$FL_DIR/share/freeling");

/// launch a spanish analyzer
$spa = new analyzer("22222","-f $FL_DIR/share/freeling/config/es.cfg","$FL_DIR/bin","$FL_DIR/share/freeling");

/// launch an english analyzer
$eng = new analyzer("33333","-f $FL_DIR/share/freeling/config/en.cfg","$FL_DIR/bin","$FL_DIR/share/freeling");

// find out language
$lang = $ident->analyze_text($mytext);

// analyze text with appropriate analyzer
if ($lang == "es")
    $result = $spa->analyze_text($mytext);
else if ($lang == "en")
    $result = $eng->analyze_text($mytext);
else
    // keep $result defined even when no analyzer matches
    $result = "can't analyze text in $lang language";

print $result;
</code>

The code for the APIs is in the source tarball and on GitHub.

Check the folder APIs/php in the FreeLing sources.

andres:

Thanks a lot. Two things: 1) How can I get notified of answers to this thread? 2) I get the error: Connection refused
I have adapted my path:
$FL_DIR = "/usr";
because the config file is at /usr/share/freeling/config/es.cfg
Is that right?

1) How can I get notified of answers to this thread?
I am afraid Drupal is rather limited in this sense...

2) I get the error: Connection refused
That looks like a socket error. Make sure the client is accessing the right port. You can also try this outside PHP to make sure it works.
Note that the PHP API is rather old (and it is more a hack than an API). It may need some updating.
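
To narrow down a "Connection refused" error, it can help to check whether anything is listening on the port at all. The following is only a minimal sketch using PHP's standard fsockopen(); the helper name port_is_open is hypothetical and not part of FreeLing, and the port 12345 on 127.0.0.1 is an assumption taken from the code posted in this thread.

```php
<?php
// Hypothetical helper (not part of FreeLing): returns true if something
// accepts TCP connections on $host:$port within $timeout seconds.
function port_is_open(string $host, int $port, float $timeout = 2.0): bool
{
    $errno = 0;
    $errstr = '';
    // @ suppresses the PHP warning; failure is reported via the return value.
    $sock = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($sock === false) {
        echo "Cannot connect to $host:$port ($errno: $errstr)\n";
        return false;
    }
    fclose($sock);
    return true;
}

// Assumption: 12345 is the port given to the analyzer constructor.
if (port_is_open('127.0.0.1', 12345)) {
    echo "A server is listening on port 12345\n";
}
```

If the check fails, the server was probably not started, or it is listening on a different port than the one the client uses.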

I have adapted my path: $FL_DIR = "/usr"; is it right this way?
Yes, it is.

andres:

Thanks.
1) There are a number of modules that can extend the core forum module in Drupal 7...

2) I could run the examples given in APIs/php, but had no success with language detection: the output is empty. Here is the code I'm using (I discovered that a port made of one repeated digit, such as 11111, fails, while consecutive digits, such as 12345, work).

<code>
<?php
include("libs/analyzer.php");

$mytext = 'Ahora no sabría dibujar, ni siquiera hacer una línea con el lápiz; y, sin embargo, jamás he sido mejor pintor. Cuando el valle se vela en torno mío con un encaje de vapores';

// Adjust this path to your local FreeLing installation
$FL_DIR = "/usr";

// launch an identifier
$ident = new analyzer("12345","--outf ident --fidn $FL_DIR/share/freeling/common/lang_ident/ident.dat","$FL_DIR/bin","$FL_DIR/share/freeling");

/// launch a spanish analyzer
$spa = new analyzer("23456","-f $FL_DIR/share/freeling/config/es.cfg","$FL_DIR/bin","$FL_DIR/share/freeling");

/// launch an english analyzer
$eng = new analyzer("34567","-f $FL_DIR/share/freeling/config/en.cfg","$FL_DIR/bin","$FL_DIR/share/freeling");

// find out language
$lang = $ident->analyze_text($mytext);

print $lang;
echo $lang;

?>
</code>

andres:

OK, after changing and adding some code, it works. If somebody else needs something similar, here it is (it is a little slow, though):
Thanks!

<code>
<?php
include("analyzer.php");

$mytext = 'Ahora no sabría dibujar, ni siquiera hacer una línea con el lápiz;';
// y, sin embargo, jamás he sido mejor pintor Cuando el valle se vela en torno mío con un encaje de vapores; cuando el sol de mediodía centellea sobre la impenetrable sombra de mi bosque sin conseguir otra cosa$

// Adjust this path to your local FreeLing installation
$FL_DIR = "/usr";

// launch an identifier
$ident = new analyzer("12345","--outf ident --fidn $FL_DIR/share/freeling/common/lang_ident/ident.dat","$FL_DIR/bin","$FL_DIR/share/freeling");

// launch a spanish analyzer
$spa = new analyzer("23456","-f $FL_DIR/share/freeling/config/es.cfg","$FL_DIR/bin","$FL_DIR/share/freeling");

// launch an english analyzer
$eng = new analyzer("34567","-f $FL_DIR/share/freeling/config/en.cfg","$FL_DIR/bin","$FL_DIR/share/freeling");

// find out language
$lang = $ident->analyze_text($mytext);

if($lang = "es") {
echo "ES";
} else if($lang = "en") {
echo "EN";
} else {
print "can't analyze text in $lang language";
}
?>
</code>

andres:

Sorry, I spoke too soon. I swapped the order of the printing code, and instead of echoing the right language, it just echoes the first echo it finds...

// print the result
if($lang = "en") {
echo "English";
}
else if($lang = "es") {
echo "Spanish";
} else {
print "can't analyze text in $lang language";
}

I'm using a Spanish text, but it prints "English".
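
A likely explanation for the behaviour described above: in PHP a single `=` is assignment, so `if($lang = "en")` stores "en" in $lang and then tests a non-empty string, which is always true. The first branch therefore always fires, whatever the identifier returned. Below is a minimal sketch of the comparison written with `===`. The helper name language_label is hypothetical, and the trim() call rests on the assumption that the identifier's output may carry trailing whitespace or a newline, which this thread does not confirm.

```php
<?php
// Hypothetical helper: maps the identifier's output to a readable label.
function language_label(string $lang): string
{
    // Assumption: the raw output may carry surrounding whitespace or a newline.
    $lang = trim($lang);
    if ($lang === "es") {          // comparison, not assignment
        return "Spanish";
    } elseif ($lang === "en") {
        return "English";
    }
    return "can't analyze text in $lang language";
}

echo language_label("es\n"), "\n";   // prints "Spanish"
```

With this in place, the order of the branches no longer affects the result.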

The PHP API is actually a hack that launches an "analyzer" in server mode, and then sends requests to it via a socket.

Also, the code you are using has some options that look like they come from older FreeLing versions.

Thus, you should first make sure that the setup works outside PHP.
That is:
- Make sure the options you use are valid for your FreeLing version, using "analyze" in interactive command-line mode.
- Launch "analyze" in server mode with the selected options.
- Use "analyzer_client" to send requests to the server and get the response.

If all that works, the next step is sending requests to the same server (launched outside PHP) from a small PHP program. See "sample1.php" and "sample2.php" in folder APIs/php.

If that also works, then you can try launching the server from inside PHP.
(The analyzer.php constructor will look for an existing server or launch a new one depending on the parameters. See example "sample1.php".)

Finally, note that this is a rather unstable and insecure way of using FreeLing.
If you use it in a production environment, do so at your own risk.

The ideal solution would be to use SWIG to create an actual PHP API.
SWIG 3.0 should support that.
If you manage to do that, contributions are welcome ;-)