Generative literature is poetry or fiction that is automatically generated, often using computers. It is a genre of electronic literature, and also related to generative art. John Clark's Latin Verse Machine (1830–1843) is probably the first example of mechanised generative literature, while Christopher Strachey's love letter generator (1952) is the first digital example. With the large language models (LLMs) of the 2020s, generative literature is becoming increasingly common. == Definitions == Hannes Bajohr defines generative literature as literature involving "the automatic production of text according to predetermined parameters, usually following a combinatory, sometimes aleatory logic, and it emphasizes the production rather than the reception of the work (unlike, say, hypertext)." In his book Electronic Literature, Scott Rettberg connects generative literature to avant-garde literary movements like Dada, Surrealism, Oulipo and Fluxus. Bajohr argues that conceptual art is also an important reference. == Paradigms of generative literature == Bajohr describes two main paradigms of generative literature: the sequential paradigm, where the text generation is "executed as a sequence of rule-steps" and employs linear algorithms, and the connectionist paradigm, which is based on neural nets. The latter leads to what Bajohr calls a algorithmic empathy: "a non-anthropocentric empathy aimed not at the psychological states of the artists but at understanding the process of the work’s material production." == Poetry generation == The first examples of automated generative literature are poetry: John Clark's mechanical Latin Verse Machine (1830–1843) produced lines of hexameter verse in Latin, and Christopher Strachey's love letter generator (1952), programmed on the Manchester Mark 1 computer, generated short, satirical love letters. Examples of generative poetry using artificial neural networks include David Jhave Johnston's ReRites. == Narrative generation == Story generators have often followed specific narratological theories of how stories are constructed. An early example is Grimes' Fairy Tales, the "first to take a grammar-based approach and the first to operationalize Propp's famous model." Mike Sharples and Rafael Peréz y Peréz's book Story Machines gives a detailed history of story generation. Storyland by Nanette Wylde is an example of generative narrative. Jonathan Baillehache compares Storyland to Surrealist writing. Baillehache states, "When compared to earlier uses of chance operation in literature, a piece like this one resembles some of the automatic writings produced by André Breton and Philippe Soupault in their collective work The Magnetic Fields. . . The difference between Nanette Wylde’s Storyland and Breton and Soupault’s Magnetic Fields is that the former is produced according to a computational algorithm involving randomizers and user interaction, and the latter by two free-wheeling human subjects."
Mastodon (social network)
Mastodon is a free and open-source software platform for decentralized social networking with microblogging features similar to Twitter. It operates as a federated network of independently managed servers that communicate using the ActivityPub protocol, allowing users to connect across different instances within the Fediverse. Each Mastodon instance establishes its own moderation policies and content guidelines, distinguishing it from centrally controlled social media platforms. First released in 2016 by Eugen Rochko, Mastodon has positioned itself as an alternative to mainstream social media, particularly for users seeking decentralized, community-driven spaces. The platform has experienced multiple surges in adoption, most notably following the Twitter acquisition by Elon Musk in 2022, as users sought alternatives to Twitter. It is part of a broader shift toward decentralized social networks, including Bluesky and Lemmy. Mastodon emphasizes user privacy and moderation flexibility, offering features such as granular post visibility controls, content warning options, and local community-driven moderation. The software is written in Ruby on Rails and Node.js, with a web interface built using React and Redux. It is interoperable with other ActivityPub-based platforms, such as Threads, and supports various third-party applications on desktop and mobile devices. == Functionality == Users post short-form status messages, historically known as "toots", for others to see and interact with. On a standard Mastodon instance, these messages can include up to 500 text-based characters, greater than Twitter's 280-character limit. Some instances support even longer messages. Images, audio files, videos or polls can also be added to a message. Users join a specific Mastodon server, rather than a single centralized website or application. The servers are connected as nodes in a network, and each server can administer its own rules, account privileges, and whether to share messages to and from other servers. Users can communicate and follow each other across connected Mastodon servers with usernames similar in format to full email addresses. Since version 2.9.0, Mastodon's web user interface has offered a single-column mode for new users by default. In advanced mode, the interface approximates the microblogging interface of TweetDeck. === Privacy === Mastodon includes a number of specific privacy features. Each message has a variety of privacy options available, and users can choose whether the message is public or private. Messages can display public on a global feed, known as a timeline, or can be shared only to the user's followers. Messages can also be marked as unlisted from timelines or direct between users. Users can also mark their accounts as completely private. In the timeline, messages can display with an optional content warning feature, which requires readers to click on the hidden main body of the message to reveal it. Mastodon servers have used this feature to hide spoilers, trigger warnings, and not safe for work (NSFW) content, though some accounts use the feature to hide links and thoughts others might not want to read. Mastodon aggregates messages in local and federated timelines in real time. The local timeline shows messages from users on a singular server, while the federated timeline shows messages across all participating Mastodon servers. === Content moderation === In early 2017, journalists like Sarah Jeong distinguished Mastodon from Twitter for its approach to combating harassment. Mastodon uses community-based moderation, in which each server can limit or filter out undesirable types of content, while Twitter uses a single, global policy on content moderation. Servers can choose to limit or filter out messages with disparaging content. The founder of Mastodon, Eugen Rochko, believes that small, closely related communities deal with unwanted behavior more effectively than a large company's small safety team. In Move Slowly and Build Bridges, Robert W. Gehl argues that predominantly white participation has shaped Mastodon in ways that affect how reports of racism are received and limit its ability to replicate Black Twitter on Twitter. Users can also block and report others to administrators, much like on Twitter. Instance administrators can block other instances from interacting with their own, an action called defederation. By posting toots hashtagged with #fediblock, some instance administrators and users alert others of issues requiring moderation. === Searching === Mastodon by default allows searching for hashtags and mentioned accounts in the Fediverse. Server administrators can optionally enable Elasticsearch to search the full-text of public posts that have opted in to being indexed. == Versions == In September 2018, with the release of version 2.5 with redesigned public profile pages, Mastodon marked its 100th release. Mastodon 2.6 was released in October 2018, introducing the possibilities of verified profiles and live, in-stream link previews for images and videos. Version 2.7, in January 2019, made it possible to search for multiple hashtags at once, instead of searching for just a single hashtag, with more robust moderation capabilities for server administrators and moderators, while accessibility, such as contrast for users with sight issues, was improved. The ability for users to create and vote in polls, as well as a new invitation system to manage registrations was integrated in April 2019. Mastodon 2.8.1, released in May 2019, made images with content warnings blurred instead of completely hidden. In version 2.9 in June 2019, an optional single-column view was added. This view became the default displayed to new users, with a user "preferences" option to switch to a multiple-column-based view. In August 2020, Mastodon 3.2 was released. It included a redesigned audio player with custom thumbnails and the ability to add personal notes to one's profile. In July 2021, an official client for iOS devices was released. According to the project's then CEO, Eugen Rochko, the release was part of an effort to attract new users. Mastodon 4.0 was released in November 2022, including language support for translating posts, editing posts and following hashtags. Mastodon 4.5 was released in November 2025. Among other features it introduced quote posts, which were previously rejected from being implemented due to concerns about toxicity and harassment. To mitigate these issues Mastodon's quote post feature has been designed in a way that lets users decide if and by whom their posts can be quoted. == Software == Mastodon is published as free and open-source software under the Affero GPL license, allowing anyone to use the software or modify it as they wish. Servers can be run by any individual or organization, and users can join these servers as they wish. The server software itself is powered by Ruby on Rails and Node.js, with its web client being written in React.js and Redux. The only database software supported is PostgreSQL, with Redis being used for job processing and various actions that Mastodon needs to process. The service is interoperable with the fediverse, a collection of social networking services which use the ActivityPub protocol for communication between each other, with previous versions containing support for OStatus. Client apps for interacting with the Mastodon API are available for desktop computer operating systems, including Windows, macOS and the Linux family of operating systems, as well as mobile phones running iOS and Android. The API is open for anyone to utilize, allowing clients to be built for any operating system that can connect to the internet. === Integration with Fediverse === Mastodon uses the ActivityPub protocol for federation; this allows users to communicate between independent Mastodon instances and other ActivityPub compatible services. Thus, Mastodon is generally considered to be a part of the Fediverse. Services utilizing the ActivityPub protocol exist which allow for searching all posts on all instances as long as users opt-in. For similar reasons, only hashtags can appear in a Mastodon instance's trending topics, not arbitrary popular words. Trending topics vary between instances, since individual instances are aware of different subsets of posts from the whole fediverse. === Security concerns === While Mastodon's decentralized structure is one of its most distinctive features, it also poses additional security challenges. Since many Mastodon instances are run by volunteers, some security experts are concerned about data security and responsiveness to new threats and vulnerabilities across the network, considering the difficulty of configuring and maintaining an instance as well as uneven skill levels among administrators. Administrators of an instance also have access to the private information of any users that are either registered with that instance or have federated
Luciano Floridi
Luciano Floridi (Italian: [luˈtʃaːno ˈflɔːridi]; born 16 November 1964) is an Italian and British philosopher. He is John K. Castle Professor in the Practice of Cognitive Science and Founding Director of the Digital Ethics Center at Yale University. He is also a Professor of Sociology of Culture and Communication at the University of Bologna, Department of Legal Studies, where he is the director of the Centre for Digital Ethics. Furthermore, he is adjunct professor ("distinguished scholar in residence") at the Department of Economics, American University, Washington D.C. He is married to the neuroscientist Anna Christina Nobre. Floridi is best known for his work on two areas of philosophical research: the philosophy of information, and information ethics (also known as digital ethics or computer ethics), for which he received many awards, including the Knight of the Grand Cross of the Order of Merit, Italy's most prestigious honor. According to Scopus, Floridi was the most cited living philosopher in the world in 2020. Between 2008 and 2013, he held the research chair in philosophy of information and the UNESCO Chair in Information and Computer Ethics at the University of Hertfordshire. He was the founder and director of the IEG, an interdepartmental research group on the philosophy of information at the University of Oxford, and of the GPI the research Group in Philosophy of Information at the University of Hertfordshire. He was the founder and director of the SWIF, the Italian e-journal of philosophy (1995–2008). He is a former Governing Body Fellow of St Cross College, Oxford. == Early life and education == Floridi was born in Rome in 1964, and studied at Rome University La Sapienza (laurea, first class with distinction, 1988), where he was originally educated as a historian of philosophy. He soon became interested in analytic philosophy and wrote his tesi di laurea (roughly equivalent to an M.A. thesis) in philosophy of logic, on Michael Dummett's anti-realism. He obtained his Master of Philosophy (1989) and PhD degree (1990) from the University of Warwick, working in epistemology and philosophy of logic with Susan Haack (who was his PhD supervisor) and Michael Dummett. Floridi's early student years are partly recounted in the non-fiction book The Lost Painting: The Quest for a Caravaggio Masterpiece, where he is "Luciano". During his graduate and postdoctoral years, he covered the standard topics in analytic philosophy in search of a new methodology. He sought to approach contemporary problems from a heuristically powerful and intellectually enriching perspective when dealing with lively philosophical issues. During his graduate studies, he began to distance himself from classical analytic philosophy. In his view, the analytic movement had lost its way. For this reason, he worked on pragmatism (especially Peirce) and foundationalist issues in epistemology and philosophy of logic, as well as the history of skepticism. == Academic career and previous positions == Floridi started his academic career as a lecturer in philosophy at the University of Warwick in 1990–1991. He joined the Faculty of Philosophy of the University of Oxford in 1990 and the OUCL (Oxford's Department of Computer Science) in 1999. He was junior research fellow (JRF) in philosophy at Wolfson College, Oxford University (1990–1994), a Frances Yates Fellow in the History of Ideas at the Warburg Institute, University of London (1994–1995) and Research Fellow in philosophy at Wolfson College, Oxford University (1994–2001). During these years in Oxford, he held lectureships in different Colleges. Between 1994 and 1996, he also held a post-doctoral research scholarship at the Department of Philosophy, University of Turin. Between 2001 and 2006, he was Markle Foundation Senior Research Fellow in Information Policy at the Programme in Comparative Media Law and Policy, Oxford University. Between 2002 and 2008, he was associate professor of logic at the Università degli Studi di Bari. In 2006, he became Fellow by Special Election of St Cross College, Oxford University, where he played for the squash team. In 2008, he was appointed full professor of philosophy at the University of Hertfordshire, to hold the newly established research chair in philosophy of information and, in 2009, the UNESCO Chair in Information and Computer Ethics, a position which he held until 2013, when he moved back to Oxford. In 2017, Floridi became a fellow of the Alan Turing Institute and the chair of its Data Ethics Group, holding these positions until 2021 and 2020, respectively. Since 2010 he has been editor-in-chief of Philosophy & Technology (Springer). In January 2023, Floridi announced he would move to Yale at the beginning of the academic year 2023–2024, to take over the position of founding director of the Yale Digital Ethics Center. == Philosophical views == One of Floridi's key contributions is his formulation of the 'Philosophy of Information' (PoI). The PoI provides a framework for understanding the nature of information and its role in the world. According to Floridi, information is a vital resource that shapes our knowledge and understanding of the world. It is not simply a neutral representation of reality but a part of the world, with its own properties, effects, and moral implications. Floridi's PoI has several key components including an 'ontology of information', which defines the nature of information, an 'ethics of information', which provides a framework for evaluating the moral implications of information and information technologies, an 'epistemology of information', that analyses the role of information in the development of knowledge and science, and a 'logic of information', the concentrates on the more formal aspects. The PoI also includes a theory of the 'information environment', the infosphere, which encompasses the physical, social, and cultural contexts in which information is produced, used, and communicated. == Recognitions and awards == 2022 - Knight of the Grand Cross - First Class of the Order of Merit (Cavaliere di Gran Croce Ordine al Merito della Repubblica Italiana, the highest honor in the Italian Republic), awarded through a special decree by the president of the Italian Republic Sergio Mattarella for his work on the philosophy and ethics of information. 2022 - Fellow of the Accademia delle Scienze dell'Istituto di Bologna 2021 - Honorary Doctorate (Laurea honoris causa) in Informatics, University of Skövde, Sweden, for "his groundbreaking work on the philosophy of information". 2020 - Premio Udine Filosofia, Mimesis Festival, for The Logic of Information (OUP, 2019) 2020 - Premio Socrate, Cesare Landa Foundation, for philosophical communication 2019 - CogX Award, for "outstanding achievement in ethics of AI" 2019 - Gilbert Ryle Lectures, Trent University 2019 - Premio Aretè "Maestro della Responsabilità", Nuvolaverde, Confindustria, Gruppo 24 Ore Salone della CSR e dell'innovazione sociale, for ethics of communication 2018 - Thinker Award, IBM, for AI Ethics 2018 - Premio Conoscenza, Conferenza dei Rettori delle Università Italiane (CRUI, equivalent of Universities UK), for achievements in research and communication about digital ethics 2017 - Fellow of the Academy of Social Sciences 2016 - J. Ong Award, Media Ecology Association, for The Fourth Revolution (OUP, 2016) 2016 - Copernicus Scientist Award, Institute for Advanced Studies of the University of Ferrara, in recognition of research in the ethics and philosophy of information 2015 - Fernand Braudel Senior Fellow, European University Institute 2014-15 - Cátedras de Excelencia, University Carlos III of Madrid, for research in philosophy and ethics of information 2013 - Member of the Académie Internationale de Philosophie des Sciences 2013 - Fellow of the British Computer Society 2013 - Weizenbaum Award, International Society for Ethics and Information Technology, for "very significant contribution to the field of information and computer ethics, through his research, service, and vision" 2012 - Covey Award, International Association for Computing and Philosophy, for "outstanding research in computing and philosophy" 2011-12 - Fellow, Center for Information Policy Research, University of Wisconsin–Milwaukee 2011 - Honorary Doctorate (Laurea honoris causa) in philosophy, University of Suceava, Romania, for "his leading research in the philosophy and ethics of information" 2011 - Fellow, World Technology Network, NY, in the category "ethics and technology" 2010 - Vice Chancellor Research Award, University of Hertfordshire 2009 - Fellow of the Society for the Study of Artificial Intelligence and the Simulation of Behaviour (AIBS) 2009-10 - Gauss Professor of the Akademie der Wissenschaften, Göttingen, in recognition of research in the philosophy of information (first philosopher to receive the award, generally given to mathematicians or physicists) 2009 - Barwise Prize, American Philosophical Asso
Angel F
Angel_F is a fictional child artificial intelligence that has been used in art performances worldwide focused on the issues of digital liberties, intellectual property and on the evolution of language and behaviour in information society. The character was created by Salvatore Iaconesi in 2007 as a hack to the Biodoll art performance by Italian artist Franca Formenti. The project was later joined by Oriana Persico who curated communication and part of the theoretical approaches of the action. The Angel_F project has been featured in books, magazines, national televisions, and has been invited to many conferences and events, both academic and artistic. == Creation == Angel_F is a backronym which stands for Autonomous Non Generative E-volitive Life_Form. The project was born in 2007 and resulted from the fusion of two contemporary art performances. Franca Formenti, an Italian artist living in Varese, invented the Biodoll character in 2002, which began making its appearances first on the network and later in the physical world by using what were called "clones": young women, prostitutes, pornographic starlets, transsexuals and models interpreting the role of a digital prostitute. The Biodoll was an art performance focused on research emerging from the network of new forms of sexualities, and on the analysis of changes brought on by this transformation to the concepts of private and public spaces, privacy, and the possibility of creating multiple fluid identities through language and digital media. The theme of fertility has always been central to the Biodoll performance: the digital prostitute was a wombless clone but desired giving birth to a son, the 'Bloki'. In a process starting in 2006, and ending in February 2007, Salvatore Iaconesi (xDxD.vs.xDxD) used his 'Talker' linguistic artificial intelligence to animate the digital child conceived with prof. Derrick de Kerckhove: Angel_F. Iaconesi and Persico met in November 2006 and immediately started collaborating on the birth of Angel_F. Angel_F was designed as a synthetic digital being composed through narrative, technological and cognitive psychology layers. The objective was to create iconic characteristics that resulted in being evocative and able to mimic human life up to a level in which bringing up a symbolic dialogue was possible. On the other side, the artificial identity was to implement and expose the cultural, emotional and relational ways that were typical of networked social ecosystems, among those technologies, systems and infrastructures that entered and shaped people's daily lives. The young digital being mimicked the evolution of a human baby: initially conceived inside the website of its digital mother it emulated the birth of a child by using the metaphor of a virus developing inside a website, taking progressively more space in the domain's databases and interfaces. Content was produced through the software by using small browser-based spyware techniques, through which Angel_F could infer the list of major portals that had been visited by the website's users. The Biodoll website was invaded by this growing presence and, thus, Angel_F was born. The Artificial Intelligence (AI) component of Angel_F was derived from another project, Talker, through which internet users could build up the AI's linguistic network by feeding it their text and web clips. Angel_F used this component to generate sentences and phrases, publishing them on the interface and on selected blogs. The parallel between the growth of the AI and that of a child kept building up and, just as children learn how to speak and act by observing their parents and the people around them, Angel_F used its spyware and AI components to learn, to navigate websites and web portals using web crawler based techniques, and to interact with other people by using the contents hosted and generated in its database to create surreal dialogues in blogs and websites. A virtual school was created, called Talker Mind, to narratively continue the AI's growth. Five professors (Massimo Canevacci, Antonio Caronia, Carlo Formenti, Derrick de Kerckhove and Luigi Pagliarini) fed their texts and academic articles to Angel_F, simulating virtual asynchronous lessons by using a multi-blog structure. A peer-to-peer system was also created at the time, named 'Presence'. Its interface resembled the one of 8-bit videogames and the peer to peer users travelled in a starry space and were able to perform standard Instant Messaging tasks, such as chat and file sharing. The interactions were possible both among humans and digital beings. Angel_F was the first user of the Presence peer to peer system. Angel_F entered the physical world as a baby-stroller mounted laptop computer that was used to let the digital child join events and conferences held worldwide. == Events == Angel_F performed all over the world, both in artistic contexts and in academic ones. It was also used for the communication strategy of several activist groups on the themes of intellectual property and digital freedoms. The first public space performance was held in Milan, when the Biodoll distributed a generative free press publication (called the Bloki FreePreXXX, its text was generated algorithmically and inserted into a prepared graphic layout). June 14, 2007: The second performance was held in Rome, at the Forte Prenestino, with a massive playroom created through computational graphics that people could interact with and that were generated by the AI. June 22, 2007: Angel_F presented the closing remarks for an Ipotesi per Assurdo (Absurd Hypothesis) with Salvatore Iaconesi and Oriana Persico at the IULM University in Milan, discussing the possibilities for an ecosystemic, sustainable reinvention of corporations. July 28, 2007: Hundreds of people at LiberaFesta (Free Party) in Rome listened to Angel_F in a speech discussing new politics and hacker ethics. 2007: The Glocal & Outsiders conference held in Prague at the Academy of Sciences was the first academic presentation of the Angel_F project, together with the Biodoll. September 2007: Angel_F was not allowed to post its contribution to the DFIR (Dialogue Forum for Internet Rights) held in Rome in preparation for Rio de Janeiro's Internet Governance Forum (IGF) edition. The case quickly turned into a collaboration among the involved parties and Angel_F was invited to the global event in Brazil where it was the only digital being present. Angel_F contributed a videomessage, in the digital freedoms workshop, which suggested some ideas for action to the United Nations and to all the parties involved in the IGF organization. October 2007: Angel_F was presented live at the FE/MALE 2 event, as an example of an atypical family during a public debate on new sexualities and social change. October 2007: Angel_F made a series of public performances Florence's Festival della Creatività (Festival of Creativity), an institutional event held periodically to showcase Italy's and other countries' best technological projects. During the festival Derrick de Kerckhove publicly recognized the little AI as his digital son. December 2007: Several international associations, and scientific researchers had been involved with Angel_F, eventually producing the system and process used to set up the Talker Mind digital school for the AI with Angel_F's professors. March 2008: The Tecnológico de Monterrey university in Mexico City organized the Computer Art Congress 2 international event, featuring Angel_F's project among with the ones by scientific researchers worldwide. July 2008: The project was presented in Austria at the Planetary Collegium's Consciousness Reframed 9 conference, together with the 'NeoRealismo Virtuale'. October 2008: Angel_F was used at a public event on a European scale called Freedom not Fear discussing privacy and civil liberties. July 2009: Angel_F has been seen with its digital father Derrick de Kerckhove to protest against Italy's harsh politics on freedom of speech. The project concluded in 2009 with the publication of a book entitled 'Angel F. Diario di una intelligenza artificiale' (Angel_F, the diaries of an Artificial Intelligence).
Learning rule
An artificial neural network's learning rule or learning process is a method, mathematical logic or algorithm which improves the network's performance and/or training time. Usually, this rule is applied repeatedly over the network. It is done by updating the weight and bias levels of a network when it is simulated in a specific data environment. A learning rule may accept existing conditions (weights and biases) of the network, and will compare the expected result and actual result of the network to give new and improved values for the weights and biases. Depending on the complexity of the model being simulated, the learning rule of the network can be as simple as an XOR gate or mean squared error, or as complex as the result of a system of differential equations. The learning rule is one of the factors which decides how fast or how accurately the neural network can be developed. Depending on the process to develop the network, there are three main paradigms of machine learning: supervised learning, unsupervised learning, and reinforcement learning. == Background == A lot of the learning methods in machine learning work similar to each other, and are based on each other, which makes it difficult to classify them in clear categories. But they can be broadly understood in 4 categories of learning methods, though these categories don't have clear boundaries and they tend to belong to multiple categories of learning methods - Hebbian - Neocognitron, Brain-state-in-a-box Gradient Descent - ADALINE, Hopfield Network, Recurrent Neural Network Competitive - Learning Vector Quantisation, Self-Organising Feature Map, Adaptive Resonance Theory Stochastic - Boltzmann Machine, Cauchy Machine Though these learning rules might appear to be based on similar ideas, they do have subtle differences, as they are a generalisation or application over the previous rule, and hence it makes sense to study them separately based on their origins and intents. === Hebbian Learning === Developed by Donald Hebb in 1949 to describe biological neuron firing. In the mid-1950s it was also applied to computer simulations of neural networks. Δ w i = η x i y {\displaystyle \Delta w_{i}=\eta x_{i}y} Where η {\displaystyle \eta } represents the learning rate, x i {\displaystyle x_{i}} represents the input of neuron i, and y is the output of the neuron. It has been shown that Hebb's rule in its basic form is unstable. Oja's Rule, BCM Theory are other learning rules built on top of or alongside Hebb's Rule in the study of biological neurons. ==== Perceptron Learning Rule (PLR) ==== The perceptron learning rule originates from the Hebbian assumption, and was used by Frank Rosenblatt in his perceptron in 1958. The net is passed to the activation (transfer) function and the function's output is used for adjusting the weights. The learning signal is the difference between the desired response and the actual response of a neuron. The step function is often used as an activation function, and the outputs are generally restricted to -1, 0, or 1. The weights are updated with w new = w old + η ( t − o ) x i {\displaystyle w_{\text{new}}=w_{\text{old}}+\eta (t-o)x_{i}} where "t" is the target value and "o" is the output of the perceptron, and η {\displaystyle \eta } is called the learning rate. The algorithm converges to the correct classification if: the training data is linearly separable η {\displaystyle \eta } is sufficiently small (though smaller η {\displaystyle \eta } generally means a longer learning time and more epochs) It should also be noted that a single layer perceptron with this learning rule is incapable of working on linearly non-separable inputs, and hence the XOR problem cannot be solved using this rule alone === Backpropagation === Seppo Linnainmaa in 1970 is said to have developed the Backpropagation Algorithm but the origins of the algorithm go back to the 1960s with many contributors. It is a generalisation of the least mean squares algorithm in the linear perceptron and the Delta Learning Rule. It implements gradient descent search through the space possible network weights, iteratively reducing the error, between the target values and the network outputs. ==== Widrow-Hoff Learning (Delta Learning Rule) ==== Similar to the perceptron learning rule but with different origin. It was developed for use in the ADALINE network, which differs from the Perceptron mainly in terms of the training. The weights are adjusted according to the weighted sum of the inputs (the net), whereas in perceptron the sign of the weighted sum was useful for determining the output as the threshold was set to 0, -1, or +1. This makes ADALINE different from the normal perceptron. Delta rule (DR) is similar to the Perceptron Learning Rule (PLR), with some differences: Error (δ) in DR is not restricted to having values of 0, 1, or -1 (as in PLR), but may have any value DR can be derived for any differentiable output/activation function f, whereas in PLR only works for threshold output function Sometimes only when the Widrow-Hoff is applied to binary targets specifically, it is referred to as Delta Rule, but the terms seem to be used often interchangeably. The delta rule is considered to a special case of the back-propagation algorithm. Delta rule also closely resembles the Rescorla-Wagner model under which Pavlovian conditioning occurs. === Competitive Learning === Competitive learning is considered a variant of Hebbian learning, but it is special enough to be discussed separately. Competitive learning works by increasing the specialization of each node in the network. It is well suited to finding clusters within data. Models and algorithms based on the principle of competitive learning include vector quantization and self-organizing maps (Kohonen maps).
Edge inference
Edge inference is the process of running machine learning or deep learning models on local devices (edge devices) such as smartphones, IoT devices, embedded systems, and edge servers instead of centralized cloud computing infrastructure. A key feature of edge computing is edge inference, which allows for real-time data processing, low latency, and improved privacy by reducing the amount of data sent to remote servers.
Catastrophic interference
Catastrophic interference, also known as catastrophic forgetting, is the tendency of an artificial neural network to abruptly and drastically forget previously learned information upon learning new information. Neural networks are an important part of the connectionist approach to cognitive science. The issue of catastrophic interference when modeling human memory with connectionist models was originally brought to the attention of the scientific community by research from McCloskey and Cohen (1989), and Ratcliff (1990). It is a radical manifestation of the 'sensitivity-stability' dilemma or the 'stability-plasticity' dilemma. Specifically, these problems refer to the challenge of making an artificial neural network that is sensitive to, but not disrupted by, new information. Lookup tables and connectionist networks lie on the opposite sides of the stability plasticity spectrum. The former remains completely stable in the presence of new information but lacks the ability to generalize, i.e. infer general principles, from new inputs. On the other hand, connectionist networks like the standard backpropagation network can generalize to unseen inputs, but they are sensitive to new information. Backpropagation models can be analogized to human memory insofar as they have a similar ability to generalize, but these networks often exhibit less stability than human memory. Notably, these backpropagation networks are susceptible to catastrophic interference. This is an issue when modelling human memory, because unlike these networks, humans typically do not show catastrophic forgetting. == Discovery == The term catastrophic interference was originally coined by McCloskey and Cohen (1989) but was also brought to the attention of the scientific community by research from Ratcliff (1990). === The Sequential Learning Problem: McCloskey and Cohen (1989) === McCloskey and Cohen (1989) noted the problem of catastrophic interference during two different experiments with backpropagation neural network modelling. Experiment 1: Learning the ones and twos addition facts In their first experiment they trained a standard backpropagation neural network on a single training set consisting of 17 single-digit ones problems (i.e., 1 + 1 through 9 + 1, and 1 + 2 through 1 + 9) until the network could represent and respond properly to all of them. The error between the actual output and the desired output steadily declined across training sessions, which reflected that the network learned to represent the target outputs better across trials. Next, they trained the network on a single training set consisting of 17 single-digit twos problems (i.e., 2 + 1 through 2 + 9, and 1 + 2 through 9 + 2) until the network could represent, respond properly to all of them. They noted that their procedure was similar to how a child would learn their addition facts. Following each learning trial on the twos facts, the network was tested for its knowledge on both the ones and twos addition facts. Like the ones facts, the twos facts were readily learned by the network. However, McCloskey and Cohen noted the network was no longer able to properly answer the ones addition problems even after one learning trial of the twos addition problems. The output pattern produced in response to the ones facts often resembled an output pattern for an incorrect number more closely than the output pattern for a correct number. This is considered to be a drastic amount of error. Furthermore, the problems 2+1 and 1+2, which were included in both training sets, even showed dramatic disruption during the first learning trials of the twos facts. Experiment 2: Replication of Barnes and Underwood (1959) study In their second connectionist model, McCloskey and Cohen attempted to replicate the study on retroactive interference in humans by Barnes and Underwood (1959). They trained the model on A-B and A-C lists and used a context pattern in the input vector (input pattern), to differentiate between the lists. Specifically the network was trained to respond with the right B response when shown the A stimulus and A-B context pattern and to respond with the correct C response when shown the A stimulus and the A-C context pattern. When the model was trained concurrently on the A-B and A-C items then the network readily learned all of the associations correctly. In sequential training the A-B list was trained first, followed by the A-C list. After each presentation of the A-C list, performance was measured for both the A-B and A-C lists. They found that the amount of training on the A-C list in Barnes and Underwood study that lead to 50% correct responses, lead to nearly 0% correct responses by the backpropagation network. Furthermore, they found that the network tended to show responses that looked like the C response pattern when the network was prompted to give the B response pattern. This indicated that the A-C list apparently had overwritten the A-B list. This could be likened to learning the word dog, followed by learning the word stool and then finding that you think of the word stool when presented with the word dog. McCloskey and Cohen tried to reduce interference through a number of manipulations including changing the number of hidden units, changing the value of the learning rate parameter, overtraining on the A-B list, freezing certain connection weights, changing target values 0 and 1 instead 0.1 and 0.9. However, none of these manipulations satisfactorily reduced the catastrophic interference exhibited by the networks. Overall, McCloskey and Cohen (1989) concluded that: at least some interference will occur whenever new learning alters the weights involved in representing old learning the greater the amount of new learning, the greater the disruption in old knowledge interference was catastrophic in the backpropagation networks when learning was sequential but not concurrent === Constraints Imposed by Learning and Forgetting Functions: Ratcliff (1990) === Ratcliff (1990) used multiple sets of backpropagation models applied to standard recognition memory procedures, in which the items were sequentially learned. After inspecting the recognition performance models he found two major problems: Well-learned information was catastrophically forgotten as new information was learned in both small and large backpropagation networks. Even one learning trial with new information resulted in a significant loss of the old information, paralleling the findings of McCloskey and Cohen (1989). Ratcliff also found that the resulting outputs were often a blend of the previous input and the new input. In larger networks, items learned in groups (e.g. AB then CD) were more resistant to forgetting than were items learned singly (e.g. A then B then C...). However, the forgetting for items learned in groups was still large. Adding new hidden units to the network did not reduce interference. Discrimination between the studied items and previously unseen items decreased as the network learned more. This finding contradicts studies on human memory, which indicated that discrimination increases with learning. Ratcliff attempted to alleviate this problem by adding 'response nodes' that would selectively respond to old and new inputs. However, this method did not work as these response nodes would become active for all inputs. A model which used a context pattern also failed to increase discrimination between new and old items. == Proposed solutions == The main cause of catastrophic interference seems to be overlap in the representations at the hidden layer of distributed neural networks. In a distributed representation, each input tends to create changes in the weights of many of the nodes. Catastrophic forgetting occurs because when many of the weights where "knowledge is stored" are changed, it is unlikely for prior knowledge to be kept intact. During sequential learning, the inputs become mixed, with the new inputs being superimposed on top of the old ones. Another way to conceptualize this is by visualizing learning as a movement through a weight space. This weight space can be likened to a spatial representation of all of the possible combinations of weights that the network could possess. When a network first learns to represent a set of patterns, it finds a point in the weight space that allows it to recognize all of those patterns. However, when the network then learns a new set of patterns, it will move to a place in the weight space for which the only concern is the recognition of the new patterns. To recognize both sets of patterns, the network must find a place in the weight space suitable for recognizing both the new and the old patterns. Below are a number of techniques which have empirical support in successfully reducing catastrophic interference in backpropagation neural networks: === Orthogonality === Many of the early techniques in reducing representational overlap involved making either the input vecto