From ilya Tue Nov  4 20:51:32 1997
Subject: Re: cperl-mode and emacs-20.2
To: rms@gnu.org
Date: Tue, 4 Nov 1997 20:51:32 -0500 (EST)
In-Reply-To: <199711041747.KAA10864@chaco.santafe.edu> from "Richard Stallman" at Nov 04, 1997 10:47:04 AM
X-Mailer: ELM [version 2.5 PL0b1]
Content-Length: 1518      
Status: O

Richard Stallman writes:
> 
>     The patch below is a "cosmetic" workaround.  It is "bad" since it
>     changes the semantics of re-match in a restricted buffer, so that what
>     is after the end of the restricted region may affect the match (like
>     removing the word boundary at the end of the restricted region if the
>     region crosses a word).  However, it eliminates the above bug.
> 
> I would rather install a real, correct fix--but I don't really
> understand what the problem is.  Could you send me a more complete
> explanation of how this problem develops?

"Correct" is not applicable here, since it is not documented how
re-search's lookahead works with restriction.  Suppose you have a
buffer containing 
       blah
and restrict it so it contains "la" instead.  Should \> match after
"a"?  I *think* that the old version would match it, but the patched
one would not.

Adding to the explanation in my original message:  You asked me to make
\s look for the syntax-table property.  To make this lookup, one needs
to know where the properties for the given object are stored
(communicated via a global), and the offset w.r.t. the start of the
object.

My code assumed that the string supplied to the re-functions is
based at the start of the object.  However, it so happens that in a
restricted buffer the (char*) is based at the start of the restricted
region, which breaks the code.

Two possible solutions would be to communicate the offset via another
global variable, or to somehow deduce it from the buffer object.

Ilya


From ilya Wed Oct 29 19:10:03 1997
Subject: Re: cperl-mode and emacs-20.2
To: dsadinof@olf.com (Danny Sadinoff)
Date: Wed, 29 Oct 1997 19:10:03 -0500 (EST)
Cc: rms@gnu.ai.mit.edu (Richard Stallman)
In-Reply-To: <3457A474.2DF3F79C@olf.com> from "Danny Sadinoff" at Oct 29, 1997 04:02:44 PM
X-Mailer: ELM [version 2.5 PL0b1]
Content-Length: 5794      
Status: OR

Danny Sadinoff writes:
> Okay.  on solaris2.5.1, I create a file called 'thefile', whose sole contents are
> two lines like so:
> #!/usr/bin/perl
> hi there
> 
> Then I run emacs like so:
> emacs-20.2 -q --no-site-file
> 
> I load cperl-mode manually,  then fetch the file.
> 
> When I try to run "M-x comment-region" on the second line of the file I get an error.

Almost missed this line...

I know this bug.  I put a workaround in my copy of 19.33, but did not
report it: it is so easy to hit that I took the absence of bug reports
to mean that it had already been fixed in 20.+.

Looks like it was not, *and* the typical expectations of Emacs users
are *very* low.

The short description is "RE-search in a restricted buffer fails if
parse-sexp-lookup-properties is set".  The longer description is 

   re-match-2 takes arguments which are substrings which correspond to
   the restricted buffer.  However, to find the properties, it needs
   to know where these substrings are w.r.t. the start of the
   buffer/string.

The patch below is a "cosmetic" workaround.  It is "bad" since it
changes the semantics of re-match in a restricted buffer, so that what
is after the end of the restricted region may affect the match (like
removing the word boundary at the end of the restricted region if the
region crosses a word).  However, it eliminates the above bug.

Enjoy,
Ilya

*** textprop.c~	Fri Mar 14 17:59:24 1997
--- textprop.c	Mon Jul 28 00:41:46 1997
*************** interval_of (position, object)
*** 539,545 ****
        i = s->intervals;
      }
  
!   if (!(beg <= position && position <= end))
      args_out_of_range (position, position);
    if (beg == end || NULL_INTERVAL_P (i))
      return NULL_INTERVAL;
--- 539,546 ----
        i = s->intervals;
      }
  
!   if (!(beg <= position && position <= end) 
!       && !(BUFFERP (object) && (position == beg - 1 && beg > 1))) /* lookbehind */
      args_out_of_range (position, position);
    if (beg == end || NULL_INTERVAL_P (i))
      return NULL_INTERVAL;
*** search.c~	Tue Mar 11 00:15:56 1997
--- search.c	Mon Jul 28 00:51:00 1997
*************** looking_at_1 (string, posix)
*** 223,249 ****
    /* Get pointers and sizes of the two strings
       that make up the visible portion of the buffer. */
  
!   p1 = BEGV_ADDR;
!   s1 = GPT - BEGV;
    p2 = GAP_END_ADDR;
    s2 = ZV - GPT;
    if (s1 < 0)
      {
        p2 = p1;
!       s2 = ZV - BEGV;
        s1 = 0;
      }
    if (s2 < 0)
      {
!       s1 = ZV - BEGV;
        s2 = 0;
      }
  
    re_match_object = Qnil;
    
    i = re_match_2 (bufp, (char *) p1, s1, (char *) p2, s2,
! 		  PT - BEGV, &search_regs,
! 		  ZV - BEGV);
    if (i == -2)
      matcher_overflow ();
  
--- 223,249 ----
    /* Get pointers and sizes of the two strings
       that make up the visible portion of the buffer. */
  
!   p1 = BEG_ADDR;
!   s1 = GPT - 1;
    p2 = GAP_END_ADDR;
    s2 = ZV - GPT;
    if (s1 < 0)
      {
        p2 = p1;
!       s2 = ZV - 1;
        s1 = 0;
      }
    if (s2 < 0)
      {
!       s1 = ZV - 1;
        s2 = 0;
      }
  
    re_match_object = Qnil;
    
    i = re_match_2 (bufp, (char *) p1, s1, (char *) p2, s2,
! 		  PT - 1, &search_regs,
! 		  ZV - 1);
    if (i == -2)
      matcher_overflow ();
  
*************** search_buffer (string, pos, lim, n, RE, 
*** 808,826 ****
        /* Get pointers and sizes of the two strings
  	 that make up the visible portion of the buffer. */
  
!       p1 = BEGV_ADDR;
!       s1 = GPT - BEGV;
        p2 = GAP_END_ADDR;
        s2 = ZV - GPT;
        if (s1 < 0)
  	{
  	  p2 = p1;
! 	  s2 = ZV - BEGV;
  	  s1 = 0;
  	}
        if (s2 < 0)
  	{
! 	  s1 = ZV - BEGV;
  	  s2 = 0;
  	}
        re_match_object = Qnil;
--- 808,826 ----
        /* Get pointers and sizes of the two strings
  	 that make up the visible portion of the buffer. */
  
!       p1 = BEG_ADDR;
!       s1 = GPT - 1;
        p2 = GAP_END_ADDR;
        s2 = ZV - GPT;
        if (s1 < 0)
  	{
  	  p2 = p1;
! 	  s2 = ZV - 1;
  	  s1 = 0;
  	}
        if (s2 < 0)
  	{
! 	  s1 = ZV - 1;
  	  s2 = 0;
  	}
        re_match_object = Qnil;
*************** search_buffer (string, pos, lim, n, RE, 
*** 829,844 ****
  	{
  	  int val;
  	  val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2,
! 			     pos - BEGV, lim - pos, &search_regs,
  			     /* Don't allow match past current point */
! 			     pos - BEGV);
  	  if (val == -2)
  	    {
  	      matcher_overflow ();
  	    }
  	  if (val >= 0)
  	    {
! 	      j = BEGV;
  	      for (i = 0; i < search_regs.num_regs; i++)
  		if (search_regs.start[i] >= 0)
  		  {
--- 829,844 ----
  	{
  	  int val;
  	  val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2,
! 			     pos - 1, lim - pos, &search_regs,
  			     /* Don't allow match past current point */
! 			     pos - 1);
  	  if (val == -2)
  	    {
  	      matcher_overflow ();
  	    }
  	  if (val >= 0)
  	    {
! 	      j = 1;
  	      for (i = 0; i < search_regs.num_regs; i++)
  		if (search_regs.start[i] >= 0)
  		  {
*************** search_buffer (string, pos, lim, n, RE, 
*** 860,874 ****
  	{
  	  int val;
  	  val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2,
! 			     pos - BEGV, lim - pos, &search_regs,
! 			     lim - BEGV);
  	  if (val == -2)
  	    {
  	      matcher_overflow ();
  	    }
  	  if (val >= 0)
  	    {
! 	      j = BEGV;
  	      for (i = 0; i < search_regs.num_regs; i++)
  		if (search_regs.start[i] >= 0)
  		  {
--- 860,874 ----
  	{
  	  int val;
  	  val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2,
! 			     pos - 1, lim - pos, &search_regs,
! 			     lim - 1);
  	  if (val == -2)
  	    {
  	      matcher_overflow ();
  	    }
  	  if (val >= 0)
  	    {
! 	      j = 1;
  	      for (i = 0; i < search_regs.num_regs; i++)
  		if (search_regs.start[i] >= 0)
  		  {