Study of Finite State Machines as Language Recognizer

Year : 2023 | Volume :01 | Issue : 01 | Page : 33-37
By

Sarita Tiwari

Kavita Tiwari

Kirti Verma

Ayush Wardhan Sahu

  1. Student Department of Computer Science Engineering, Lakshmi Narain College of Technology, Jabalpur Madhya Pradesh India
  2. Student Department of Computer Science Engineering, Lakshmi Narain College of Technology, Jabalpur Madhya Pradesh India
  3. Dean and HOD Department of Engineering Mathematics, Lakshmi Narain College of Technology, Jabalpur Madhya Pradesh India
  4. Assistant Professor Department of Computer Science Engineering, Lakshmi Narain College of Technology, Jabalpur Madhya Pradesh India

Abstract

Finite State Machines (FSMs) play a fundamental role in computer science and linguistics as language recognizers. This study presents an exploration of the principles and applications of FSMs as efficient tools for recognizing formal languages. The study delves into the theoretical foundations of FSMs and their practical implementation in various language recognition tasks. The fundamental ideas of FSMs, including as states, transitions, and input symbols, are introduced in this study. A clear understanding of these components is essential in constructing FSMs capable of recognizing specific languages. Additionally, the fundamental types of FSMs, namely deterministic and nondeterministic, are discussed, with a focus on their unique properties and computational differences. It examines the theoretical power of FSMs in classifying languages. We analyse the Chomsky hierarchy, which categorizes languages into four classes based on their generative power: regular, context-free, context-sensitive, and recursively enumerable. We illustrate how FSMs effectively identify regular languages, the simplest class in the hierarchy, and the limits of their expressive capacity for more complex languages. The study investigates the application of FSMs in various language recognition scenarios. We demonstrate how FSMs can be employed in lexical analysis for programming languages, enabling efficient tokenization and syntax parsing. In order to demonstrate their adaptability in many linguistic situations, we also study their function in natural language processing, such as part-of-speech tagging and named entity recognition. We examine the difficulties and restrictions faced by FSMs when used as language recognizers. While FSMs are powerful tools for regular languages, they face constraints when dealing with context-dependent phenomena present in more complex languages. We discuss possible extensions and hybrid approaches to overcome these limitations. This study presents case studies and practical implementations of FSMs in real-world language recognition systems. We highlight success stories in automating various language-related tasks, underscoring the effectiveness and efficiency of FSM-based approaches.

Keywords: Finite State Machines (FSMs), regular languages, recursively enumerable, Moore state machine, Mealy State Machine

[This article belongs to International Journal of Data Structure Studies(ijdss)]

How to cite this article: Sarita Tiwari, Kavita Tiwari, Kirti Verma, Ayush Wardhan Sahu. Study of Finite State Machines as Language Recognizer. International Journal of Data Structure Studies. 2023; 01(01):33-37.
How to cite this URL: Sarita Tiwari, Kavita Tiwari, Kirti Verma, Ayush Wardhan Sahu. Study of Finite State Machines as Language Recognizer. International Journal of Data Structure Studies. 2023; 01(01):33-37. Available from: https://journals.stmjournals.com/ijdss/article=2023/view=117400


Browse Figures

References

1. Christopher DM, Hinrich S. Foundations of statistical natural language processing. Cambridge, MA: MIT Press; 1999.
2. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943 Dec; 5: 115–33.
3. Chapter 8. CSCI 340: Computational Models Finite Automata with Output. [Online]. Available from: https://assets.ctfassets.net/kdr3qnns3kvk/2I3MeF7aNeskN5isOdd6wp/9d2eb2cb5b5e622f0 7bc5a4d09c07130/08-Finite-Automata-with-Output.pdf
4. Jurafsky D, Martin JH. Speech and Language Processing. 2nd Edn. Prentice Hall; 2008.
5. Hebisch U, Weinert HJ. Semirings: algebraic theory and applications in computer science. Singapore: World Scientific; 1998.
6. Golan JS. Semirings and their applications. Dordrecht-Boston-London: Kluwer Academic Publishers; 1999.
7. Yu S. Regular languages. Berlin Heidelberg: Springer; 1997; 41–110.
8. Schützenberger MP. On the definition of a family of automata. Inf Control. 1961 Sep 1; 4(2–3): 245–70.
9. Droste M, Rahonis G. Weighted automata and weighted logics with discounting. Theor Comput Sci. 2009 Sep 1; 410(37): 3481–94.
10. Knight K, May J. Applications of weighted automata in natural language processing. In: Handbook of Weighted Automata. 2009 Sep 16; 571–596. Berlin, Heidelberg: Springer.
11. Sakarovitch J. Rational and recognisable power series. In: Handbook of Weighted Automata. 2009 Sep 16; 105–174. Berlin, Heidelberg: Springer.
12. Wirth N, Informaticien S, et al. Compiler construction. Reading: Addison-Wesley; 1996 Oct 1.
13. Xue N. Chinese word segmentation as character tagging. International Journal of Computational Linguistics & Chinese Language Processing (IJCLCLP); Special Issue on Word Formation and Chinese Language Processing. 2003 Feb; 8(1): 29–48.
14. Chen Z, Rau PL, Ma L. How to Develop a User-Friendly Chinese Hand Input System for the Touch Device? A Case Study. In Cross-Cultural Design: 8th International Conference, CCD 2016, Held as Part of HCI International 2016, Toronto, ON, Canada. 2016 Jul 17–22 ; 26–33. Springer International Publishing.
15. Xia F. The segmentation guidelines for the Penn Chinese treebank (3.0). Tech Rep IRCS 00-06. University of Pennsylvania; 2000.
16. Shah S, Ghomeshi H, Vakaj E, Cooper E, Fouad S. A review of natural language processing in contact centre automation. Pattern Anal Appl. 2023 Jun 29; 26(3): 1–24.
17. Nourian M, Wu H, Becchi M. A compiler framework for fixed-topology non-deterministic finite automata on simd platforms. In 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS). 2018 Dec 11; 507–516.
18. Xue N, Xia F, Chiou FD, Palmer M. The penn Chinese treebank: Phrase structure annotation of a large corpus. Nat Lang Eng. 2005 Jun; 11(2): 207–38.
19. Sproat R, Shih C, Gale W, Chang N. A stochastic finite-state word-segmentation algorithm for Chinese. arXiv preprint cmp-lg/9405008. 1994 May 3.


Regular Issue Subscription Review Article
Volume 01
Issue 01
Received July 24, 2023
Accepted July 26, 2023
Published September 4, 2023