尤川豪 · 5 years ago

Counting Chinese and English words in PHP

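// Fall back to UTF-8 if detection fails; the /u pattern below requires UTF-8 input anyway.
$encoding = mb_detect_encoding($text) ?: 'UTF-8';

$result = array(
    'count_cn' => 0,
    'count_en' => 0,
);

// Chinese: keep only Han characters and the listed punctuation, then count characters.
// ASCII punctuation in the list (. , ? ! and so on) is counted here as well.
$text_cn  = preg_replace("/[^\p{Han}\?\!\;\.\〜\ー\。\,\「\」\《\、\》\【\】\『\』\:\(\)\(\)\/\・]/u","", $text);

$result['count_cn'] =  mb_strlen($text_cn, $encoding);

// English: drop quotes first so that "it's" stays one word.
$text_en  = preg_replace("/[\'\"]/","", $text);

// Turn everything that is not an ASCII letter or whitespace into a space, then count words.
$text_en  = preg_replace("/[^a-zA-Z\s]/"," ", $text_en);

$result['count_en'] = str_word_count($text_en);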
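
For reuse, the same two steps can be wrapped in a function. The following is only a minimal sketch: the name count_cn_en() is hypothetical, UTF-8 input is assumed, and for brevity it counts Han characters only, leaving out the punctuation list used above.

```php
<?php
// Hypothetical wrapper; the name count_cn_en() is not from the original post.
function count_cn_en(string $text): array
{
    // Han characters only (assumption: punctuation is not counted in this sketch).
    // preg_replace() returns null on invalid UTF-8, hence the ?? '' fallback.
    $text_cn = preg_replace('/[^\p{Han}]/u', '', $text) ?? '';

    // Same English steps as the post: drop quotes, blank out non-letters, count words.
    $text_en = preg_replace('/[\'"]/', '', $text);
    $text_en = preg_replace('/[^a-zA-Z\s]/', ' ', $text_en);

    return array(
        'count_cn' => mb_strlen($text_cn, 'UTF-8'),
        'count_en' => str_word_count($text_en),
    );
}

$counts = count_cn_en("我喜歡 PHP and it's fun");
echo $counts['count_cn'], ' ', $counts['count_en'], PHP_EOL; // prints "3 4"
```

The three Han characters give count_cn = 3, and "PHP", "and", "its", "fun" give count_en = 4.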
1 comment
尤川豪 · 5 years ago

A multilingual version:

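// Fall back to UTF-8 if detection fails; the /u patterns below require UTF-8 input anyway.
$encoding = mb_detect_encoding($text) ?: 'UTF-8';

$result = array(
    'count_cn' => 0,
    'count_en' => 0,
    'count_jp' => 0,
    'count_es' => 0,
    'count_id' => 0,
    'count_ko' => 0,
    'count_ru' => 0,
    'count_ru_literra' => 0,
);

// Chinese: Han characters plus the listed punctuation, counted per character.
$text_cn  = preg_replace("/[^\p{Han}\?\!\;\.\〜\ー\。\,\「\」\《\、\》\【\】\『\』\:\(\)\(\)\/\・]/u","", $text);

$result['count_cn'] =  mb_strlen($text_cn, $encoding);

// English: drop quotes, turn other non-letters into spaces, then count words.
$text_en  = preg_replace("/[\'\"]/","", $text);

$text_en  = preg_replace("/[^a-zA-Z\s]/"," ", $text_en);

$result['count_en'] = str_word_count($text_en);

// Japanese: Han, Hiragana, Katakana and punctuation, counted per character.
$text_jp  = preg_replace("/[^\p{Han}\p{Hiragana}\p{Katakana}\.\〜\ー\。\,\「\」\《\、\》\【\】\『\』\:\(\)\(\)\/\・]/u","", $text );
$result['count_jp'] =  mb_strlen($text_jp, $encoding);

// Korean: Han, Hangul and punctuation, counted per character.
$text_ko  = preg_replace("/[^\p{Han}\p{Hangul}\?\!\;\.\〜\ー\。\,\「\」\《\、\》\【\】\『\』\:\(\)\(\)\/\・]/u","", $text);
$result['count_ko'] =  mb_strlen($text_ko, $encoding);

// Russian: keep the basic Cyrillic letters (U+0410 to U+044F) and whitespace, then count
// whitespace-separated words. str_word_count() is not used because it does not handle
// multibyte letters. PREG_SPLIT_NO_EMPTY keeps leading or trailing whitespace, and empty
// input, from being counted as a word.
$text_ru  = preg_replace("/[^\x{0430}-\x{044F}\x{0410}-\x{042F}\s]/u"," ", $text);

$result['count_ru'] = count(preg_split('/\s+/u', $text_ru, -1, PREG_SPLIT_NO_EMPTY));

$result['count_ru_literra'] = $result['count_ru'];

// Spanish and Indonesian reuse the ASCII-only English word count. Accented letters such
// as ñ or é were replaced by spaces above, so words containing them are split in two.
$result['count_es'] = str_word_count($text_en);

$result['count_id'] = str_word_count($text_en);

return $result;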
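
The snippet above reuses the ASCII-only English count for Spanish and Indonesian, so a word such as "niño" is counted as two words. One possible alternative, shown here only as a sketch, is to count runs of Latin-script letters with a Unicode-aware pattern. The helper name count_latin_words() is hypothetical, and UTF-8 input is assumed.

```php
<?php
// Hypothetical helper, not part of the original post.
function count_latin_words(string $text): int
{
    // Drop quotes first, as the original snippet does, so "it's" stays one word.
    $text = preg_replace('/[\'"]/', '', $text);

    // Each run of Latin-script letters (plus combining marks) counts as one word.
    $matched = preg_match_all('/[\p{Latin}\p{M}]+/u', $text);

    // preg_match_all() returns false on error, for example on invalid UTF-8 input.
    return $matched === false ? 0 : $matched;
}

echo count_latin_words('El niño come una manzana'), PHP_EOL; // prints 5
```

With the English pattern from the post, the same sentence yields 6, because the two bytes of "ñ" become spaces and split "niño" into "ni" and "o".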
 