Why wouldn't it be able to? If you have a funny accent that VA gets wrong, then you just need to type in the command as VA reports hearing it. There's no point programming in the Queen's English if you have a thick Geordie accent. I don't understand why people have such trouble with this.
Anyway, the pixel coordinate trick is useful in various non-obvious ways, but my point is that VA can emulate a mouse click.