Zend_Validate_Regex not happy with accented characters

I'm using Zend Framework and I need to validate a text field that accepts not only digits and plain letters, but also accented characters like 'ã', 'ç' and so on.

I was confident that a simple regex validation would do the job:

    public function SetTitle($title)
    {
        $validator = new Zend_Validate_Regex('/^[0-9a-zA-ZÀ-ú]+[0-9A-Za-zÀ-ú\'\-\.:,; ]{1,50}$/');

        if ($validator->isValid($title)) {
            // Only assign when the value actually changes.
            if ($this->title != $title) {
                $this->title = $title;
            }
        } else {
            throw new MyApp_Projects_ProjectException("This ($title) is not a valid title.");
        }
    } //SetTitle

And it really did work when, after some thinking (reported below), I tested it with something like this:

    public function testIfCanAttributeTitleToProject()
    {
        $someTitle = "some title with ç, á and ã";
        $this->project->SetTitle($someTitle);
        $this->assertEquals($this->project->getTitle(), $someTitle);
    }

But when I add a validator to check the data in the form, like this:

    $title = new Zend_Form_Element_Text('title');
    $title->setOptions(array('size' => '50'))
          ->addValidator('Regex', false, array(
              'pattern' => "/^[0-9a-zA-ZÀ-ú]+[0-9A-Za-zÀ-ú\'\-\.,: ]{1,50}$/"
          ));
    // attach elements to form

an error is raised when I run this test:

    public function testUserCanUseAccentedCharacters()
    {
        $form = new MyApp_Form_ProjectCreate();
        $formData = array(
            'title' => 'we scream to weird chars like ã é or ç',
            'submit' => true
        );
        $form->process($formData);
    }

where the process function looks like this:

    public function process($data)
    {
        if ($this->isValid($data) !== true) {
            throw new MyApp_Form_ProjectCreateException('Invalid data!');
        } else {
            $db = Zend_Registry::get('db');
            $projectMapper = new MyApp_Projects_ProjectMapper($db);
            $project = new MyApp_Projects_Project();
            // ... (rest of the method not shown)
        }
    }

I have already checked and retested the regular expression in other contexts and it seems OK. But for some reason, even though Zend_Validate itself works with this expression, a validator attached to a form element doesn't accept anything in the À-ú range.
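A discrepancy like this can come from character encoding rather than from the validator classes. The sketch below is only an illustration under that assumption: it uses plain `preg_match` (which is what a regex validator ultimately relies on) and explicit byte escapes, so it does not depend on how any source file is saved. It shows that a byte-wise `À-ú` range (as it would look in a file saved as ISO-8859-1) rejects a UTF-8 encoded 'ç', while the same range with the `/u` modifier accepts it. The variable names are made up for the example.

```php
<?php
$utf8   = "\xC3\xA7"; // 'ç' encoded as UTF-8 (two bytes)
$latin1 = "\xE7";     // 'ç' encoded as ISO-8859-1 (one byte)

// Without /u, PCRE compares single bytes. This is what [À-ú] amounts to
// when the file holding the pattern is saved as ISO-8859-1.
$bytePattern = "/^[\xC0-\xFA]+$/";
var_dump(preg_match($bytePattern, $latin1)); // int(1)
var_dump(preg_match($bytePattern, $utf8));   // int(0): byte A7 is outside C0-FA

// With /u, pattern and subject are both read as UTF-8 and the range
// is compared by code point (U+00C0 to U+00FA).
$utf8Pattern = '/^[\x{C0}-\x{FA}]+$/u';
var_dump(preg_match($utf8Pattern, $utf8));   // int(1)
var_dump(preg_match($utf8Pattern, $latin1)); // bool(false): subject is not valid UTF-8
```

If the file defining the form and the file defining the model class are saved in different encodings, the same-looking pattern could behave differently in each, which would be consistent with the symptom described here.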

I'm surely (still) missing something basic here, or banging my head against a wall when there is a better way around.

Can someone help me, please?

TIA, again... :)


  • '/^[0-9a-zA-ZÀ-ú]+[0-9A-Za-zÀ-ú\'\-\. ]{1,50}$/'

    embeds a single quote inside a single-quoted string. Does the double-quoted version work for you?

    "/^[0-9a-zA-ZÀ-ú]+[0-9A-Za-zÀ-ú\'\-\. ]{1,50}$/"


    Three more things to try. I don't know the details of Zend's implementation of regular expressions, so I don't know whether the first two will work.

    The Unicode Letter property:

    "/^[0-9\p{L}]+[0-9\p{L}\'\-\. ]{1,50}$/u"

    The Posix character class:

    "/^[0-9[:alpha:]]+[0-9[:alpha:]\'\-\. ]{1,50}$/"

    Brute force enumeration of the letters you care about:

    "/^[0-9a-zA-ZÀÁÂ cetera... øùú]+[0-9A-Za-zÀÁÂ cetera... øùú\'\-\. ]{1,50}$/"
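To make the Unicode letter suggestion concrete, here is a minimal sketch using plain `preg_match`, assuming the input is UTF-8. `title_is_valid()` is a hypothetical helper name, not part of Zend Framework. It uses the short property name `\p{L}` together with the `/u` modifier, and the punctuation set from the question's `SetTitle()` pattern. Note that `\p{L}` is broader than `À-ú`: it accepts letters from any script.

```php
<?php
// Hypothetical helper: checks a title against the rule from the question,
// with the accented-letter range replaced by the Unicode letter property.
function title_is_valid($title)
{
    // One or more digits or letters first, then 1 to 50 characters that may
    // also be ' - . : , ; or a space. /u makes PCRE read the subject as UTF-8.
    $pattern = '/^[0-9\p{L}]+[0-9\p{L}\'\-.:,; ]{1,50}$/u';

    return preg_match($pattern, $title) === 1;
}

// "some title with ç, á and ã", written with byte escapes so the example
// does not depend on the encoding of this file.
var_dump(title_is_valid("some title with \xC3\xA7, \xC3\xA1 and \xC3\xA3")); // bool(true)
var_dump(title_is_valid("!not a title"));                                   // bool(false)
```

The same pattern string should be usable as the `pattern` option of the form's Regex validator, though that has not been tested against Zend here.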