Showing posts with label Automatic Film Editing. Show all posts

Artificial Intelligence - What Is Deep Learning?

 



Deep learning is a subset of methods, tools, and techniques in artificial intelligence or machine learning.

Learning in this case involves the ability to derive meaningful information from various layers or representations of any given data set in order to complete tasks without human instruction.

Deep refers to the depth of a learning algorithm, which usually involves many layers.

Machine learning networks involving many layers are often considered to be deep, while those with only a few layers are considered shallow.

The recent rise of deep learning over the 2010s is largely due to computer hardware advances that permit the use of computationally expensive algorithms and allow storage of immense datasets.

Deep learning has produced exciting results in the fields of computer vision, natural language processing, and speech recognition.

Notable examples of its application can be found in personal assistants such as Apple’s Siri or Amazon Alexa, as well as in search, video, and product recommendations.

Deep learning has been used to beat human champions at popular games such as Go and Chess.

Artificial neural networks are the most common form of deep learning.

Neural networks extract information through multiple stacked layers commonly known as hidden layers.





These layers contain artificial neurons, each of which is connected by its own weight to neurons in other layers.

Neural networks often involve dense or fully connected layers, meaning that each neuron in any given layer will connect to every neuron of its preceding layer.

This allows the network to learn increasingly intricate details from the data as it passes through each subsequent layer.
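As a rough illustration of the fully connected layers described above, the following sketch stacks two dense layers using NumPy. The layer sizes, the random weights, and the ReLU activation are assumptions chosen for the example, not details from this article.

```python
import numpy as np

def dense_forward(inputs, weights, bias):
    """Fully connected layer: every input value feeds every output neuron."""
    # ReLU activation is an assumption made for this sketch.
    return np.maximum(0.0, inputs @ weights + bias)

rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 3))      # 4 examples with 3 input features each

w1 = rng.normal(size=(3, 5))         # 3 inputs x 5 hidden neurons = 15 weights
b1 = np.zeros(5)
w2 = rng.normal(size=(5, 2))         # 5 hidden neurons x 2 output neurons
b2 = np.zeros(2)

hidden = dense_forward(batch, w1, b1)    # first hidden layer
output = dense_forward(hidden, w2, b2)   # second layer reads the first layer's output
print(hidden.shape, output.shape)
```

Each weight matrix has one entry per pair of connected neurons, which is what "fully connected" means in practice.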

Part of what separates deep learning from other forms of machine learning is its ability to work with unstructured data.

Unstructured data has no predefined labels or features.

Using several stacked layers, deep learning algorithms can learn to associate their own features with unstructured inputs.

This is achieved through a hierarchical approach in which a deep, multilayered learning algorithm extracts more detailed information with each successive layer, allowing it to break a very complicated problem into a series of simpler ones.

A network is trained through the following steps. First, small batches of labeled data are passed forward through the network.

Next, the network's loss is calculated by comparing its predictions with the true labels.

Backpropagation is then used to compute how much each weight contributed to the error and to pass those corrections back to the weights.

The weights are adjusted slightly so as to reduce the loss on each round of predictions.

The process is repeated until the network's loss has been minimized and it makes accurate predictions at a high rate.
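The training steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production recipe: the toy dataset, the network size, the tanh and sigmoid activations, the cross-entropy loss, and the learning rate are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labeled data (an assumption for illustration): the label is 1 when
# the two input features sum to a positive number.
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float).reshape(-1, 1)

# A small network with one hidden layer of 8 neurons (sizes are arbitrary).
params = {
    "W1": rng.normal(scale=0.5, size=(2, 8)), "b1": np.zeros(8),
    "W2": rng.normal(scale=0.5, size=(8, 1)), "b2": np.zeros(1),
}

def forward(p, inputs):
    hidden = np.tanh(inputs @ p["W1"] + p["b1"])
    probs = 1.0 / (1.0 + np.exp(-(hidden @ p["W2"] + p["b2"])))
    return hidden, probs

def cross_entropy(probs, labels):
    eps = 1e-9
    return float(-np.mean(labels * np.log(probs + eps)
                          + (1 - labels) * np.log(1 - probs + eps)))

learning_rate, batch_size = 0.2, 20
initial_loss = cross_entropy(forward(params, X)[1], y)

for epoch in range(100):
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = order[start:start + batch_size]
        xb, yb = X[batch], y[batch]
        # Step 1: pass a small batch of labeled data through the network.
        hidden, probs = forward(params, xb)
        # Step 2: compare predictions with the true labels to get the loss.
        batch_loss = cross_entropy(probs, yb)
        # Step 3: backpropagation computes the gradient of the loss
        # with respect to every weight and bias.
        d_out = (probs - yb) / len(xb)
        grads = {"W2": hidden.T @ d_out, "b2": d_out.sum(axis=0)}
        d_hidden = (d_out @ params["W2"].T) * (1 - hidden ** 2)
        grads["W1"] = xb.T @ d_hidden
        grads["b1"] = d_hidden.sum(axis=0)
        # Step 4: nudge each weight slightly in the direction that lowers the loss.
        for name in params:
            params[name] -= learning_rate * grads[name]

final_loss = cross_entropy(forward(params, X)[1], y)
accuracy = float(np.mean((forward(params, X)[1] > 0.5) == (y > 0.5)))
print(initial_loss, final_loss, accuracy)
```

Deep learning frameworks compute the gradients automatically; they are written out by hand here only to make the backpropagation step visible.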

Deep learning has an advantage over many machine learning approaches and shallow learning networks since it can self-optimize its layers.

Traditional machine learning or shallow learning methods, which have only a few layers at most, need human involvement in preparing unstructured data for input, a process often known as feature engineering.





This can be a lengthy process that takes too much time to be worthwhile, particularly if the dataset is very large.

For these reasons, traditional machine learning algorithms may seem to be a thing of the past.

Deep learning algorithms, on the other hand, come at a price.

Learning their own features requires a large quantity of data, which is not always available.

Furthermore, as data volumes get larger, so do the processing power and training time requirements, since the network will be dealing with a lot more data.

Depending on the number and kinds of layers utilized, training time will also rise.

Fortunately, cloud computing, which lets anyone rent powerful machines for a fee, makes it possible to run some of the most demanding deep learning networks.

Convolutional neural networks use types of hidden layers that are not part of the standard neural network architecture.

Deep learning of this kind is most often associated with computer vision projects, and it is now the most widely used approach in that field.

To obtain information from an image, a basic convolutional network typically uses three kinds of layers: convolutional layers, pooling layers, and dense layers.

Convolutional layers extract low-level features such as edges and curves by sliding a window, or convolutional kernel, over the image.

Subsequent stacked convolutional layers repeat this procedure over the newly formed layers of low-level features, searching for progressively higher-level features until the image is fully understood.

Different hyperparameters, such as the size of the kernel or the distance it slides across the image at each step (the stride), can be adjusted to find different kinds of features.
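The sliding-window operation can be illustrated with a small NumPy sketch. The function name `conv2d`, the 6x6 test image, and the hand-written edge-detecting kernel are assumptions made for the example; in a trained network the kernel values are learned rather than chosen by hand. Like most deep learning libraries, the sketch applies the kernel without flipping it.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide a kernel over a 2D image and record its response at each position."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# A 6x6 image that is dark on the left half and bright on the right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A 2x2 kernel that responds where brightness increases from left to right.
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

feature_map = conv2d(image, kernel, stride=1)   # shape (5, 5)
strided_map = conv2d(image, kernel, stride=2)   # shape (3, 3)
print(feature_map)
```

The feature map is nonzero only in the column where the kernel straddles the vertical edge. Raising the stride from 1 to 2 shrinks the output from 5x5 to 3x3, which shows how these two hyperparameters change what the layer produces.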

Pooling layers enable a network to learn higher-level elements of an image progressively by downsampling the image along the way.

Without pooling layers placed between the convolutional layers, the network may become too computationally expensive, since each successive layer examines more detailed data.

In addition, the pooling layer reduces the size of an image while preserving important details.

These features become translation invariant, which means that a feature seen in one part of an image can be recognized in an entirely different region of the same image.
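A small sketch of max pooling, one common kind of pooling layer, shows both effects described above. The function name `max_pool2d`, the 2x2 window, and the sample arrays are assumptions made for the example.

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    """Downsample by keeping the largest value in each size x size block."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size        # drop rows/columns that do not fit
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

fm = np.array([[1, 0, 0, 0],
               [0, 0, 0, 2],
               [0, 3, 0, 0],
               [0, 0, 4, 0]], dtype=float)
pooled = max_pool2d(fm)                      # 4x4 input becomes 2x2

# The same strong response at two nearby positions within one pooling block.
a = np.zeros((4, 4)); a[0, 0] = 9.0
b = np.zeros((4, 4)); b[1, 1] = 9.0
same_after_pooling = np.array_equal(max_pool2d(a), max_pool2d(b))
print(pooled, same_after_pooling)
```

The pooled map is a quarter of the original size but keeps the strongest response from each block. The two shifted inputs pool to the same result, which gives a small-scale picture of the tolerance to position shifts mentioned above.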

The ability of a convolutional neural network to retain positional information is critical for image classification.

The ability of deep learning to automatically parse through unstructured data to find local features that it deems important while retaining positional information about how these features interact with one another demonstrates the power of convolutional neural networks.

Recurrent neural networks excel at sequence-based tasks like sentence completion and stock price prediction.

The essential idea is that, unlike the networks described above, in which neurons only pass information forward, neurons in recurrent neural networks feed information forward while also looping their output back to themselves at each time step.

Recurrent neural networks may be thought of as having a rudimentary form of memory, since each time step incorporates recurrent information from all previous time steps.
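The looping behavior can be sketched with a basic recurrent layer in NumPy. The function name `rnn_forward`, the layer sizes, the random weights, and the tanh activation are assumptions made for the example.

```python
import numpy as np

def rnn_forward(inputs, w_in, w_rec, bias):
    """Run a simple recurrent layer over a sequence, one time step at a time."""
    state = np.zeros(w_rec.shape[0])          # the memory starts out empty
    states = []
    for x_t in inputs:
        # The new state depends on the current input AND the previous state.
        state = np.tanh(x_t @ w_in + state @ w_rec + bias)
        states.append(state)
    return np.stack(states)

rng = np.random.default_rng(0)
w_in = rng.normal(scale=0.5, size=(3, 4))     # 3 input features -> 4 hidden units
w_rec = rng.normal(scale=0.5, size=(4, 4))    # hidden state fed back into itself
bias = np.zeros(4)

seq_a = rng.normal(size=(5, 3))               # a sequence of 5 time steps
seq_b = seq_a.copy()
seq_b[0] += 1.0                               # change only the first time step

states_a = rnn_forward(seq_a, w_in, w_rec, bias)
states_b = rnn_forward(seq_b, w_in, w_rec, bias)
print(np.abs(states_a[-1] - states_b[-1]))
```

The two sequences differ only at the first time step, yet their final states differ as well, because each state is computed from the one before it. This is the rudimentary memory referred to above.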

This is often utilized in natural language processing projects because recurrent neural networks can handle text in a way that is more human-like.

Instead of seeing a phrase as a collection of isolated words, a recurrent neural network may begin to analyze the sentiment of the sentence or even generate the next sentence autonomously based on what has already been said.

Deep learning can provide powerful methods of analyzing unstructured data, in many ways resembling human abilities.

Unlike humans, deep learning networks never get tired.

Deep learning may substantially outperform standard machine learning techniques when given enough training data and powerful computers, particularly given its autonomous feature engineering capabilities.

Image classification, voice recognition, and self-driving vehicles are just a few of the fields that have benefited tremendously from deep learning research over the previous decade.

Many new exciting deep learning applications will emerge if current enthusiasm and computer hardware upgrades continue to grow.


~ Jai Krishna Ponnappan




See also: 


Automatic Film Editing; Berger-Wolf, Tanya; Cheng, Lili; Clinical Decision Support Systems; Hassabis, Demis; Tambe, Milind.



Artificial Intelligence - What Is Computational Creativity?

 



Computational creativity is a concept that is related to computer-generated art, although it is not reducible to it.

According to Margaret Boden, "CG-art" is an artwork that "results from some computer program being allowed to operate on its own, with zero input from the human artist" (Boden 2010, 141).

This definition is both strict and narrow, since it is limited to the production of "art works" as defined by human observers.

Computational creativity, on the other hand, is a more comprehensive term that covers a wider range of activities, devices, and outcomes.

According to Simon Colton and Geraint A. Wiggins, computational creativity is an area of artificial intelligence (AI) research in which researchers build and work with computational systems that create "artefacts and ideas."

Those "artefacts and ideas" might be works of art, as well as other things, discoveries, and/or performances (Colton and Wiggins 2012, 21).

Games, narrative, music composition and performance, and visual arts are examples of computational creativity applications and implementations.

Games and other cognitive skill competitions are often used to evaluate and assess machine skills.

In fact, the defining test of machine intelligence was formulated as a game, which Alan Turing called "the imitation game" (1950).

Since then, AI progress and accomplishment have been monitored and evaluated via games and other human-machine contests.

Chess has held a privileged position among the games in which computers have competed, to the point where critics such as Douglas Hofstadter (1979, 674) and Hubert Dreyfus (1992) confidently asserted that championship-level chess would forever remain beyond the reach of AI.

Then, in 1997, IBM's Deep Blue changed the rules of the game by defeating Garry Kasparov.

But chess was just the start.

In March 2016, AlphaGo, a Go-playing algorithm built by Google DeepMind, defeated Lee Sedol, one of the most celebrated human players of this notoriously difficult board game, in four out of five games.

Human observers, such as Fan Hui (2016), have praised AlphaGo's nimble play as "beautiful," "intuitive," and "innovative."

In narrative, natural language generation (NLG) systems such as Automated Insights' Wordsmith and Narrative Science's Quill are used to create human-readable stories from machine-readable data.

Unlike basic news aggregators or template NLG systems, these programs "write" (or "generate," as the case may be) original stories that are, in many cases, almost indistinguishable from human-written material.

Christer Clerwall, for example, conducted a small-scale study in 2014 in which human test subjects were asked to evaluate news stories written by Wordsmith and by a professional reporter from the Los Angeles Times.

The study's findings suggest that, although software-generated content is often perceived as descriptive and dull, it is also regarded as more objective and trustworthy (Clerwall 2014, 519).

In their famous paper "Heuristic Problem Solving" (1958), Herbert Simon and Allen Newell predicted that within ten years a digital computer would compose music that critics would regard as having considerable aesthetic value (Simon and Newell 1958, 7).

This prediction has come true.

David Cope's Experiments in Musical Intelligence (EMI, or "Emmy") is one of the best-known efforts in the field of "algorithmic composition."

Emmy is a computer-based algorithmic composer capable of analyzing existing musical compositions, rearranging their fundamental components, and then creating new, unique scores that sound like and, in some circumstances, are indistinguishable from Mozart, Bach, and Chopin's iconic masterpieces (Cope 2001).

In music performance, there are robotic systems such as Shimon, a marimba-playing jazz-bot from Georgia Tech, that can not only improvise with human musicians in real time but also are "designed to create meaningful and inspiring musical interactions with humans, leading to novel musical experiences and outcomes" (Hoffman and Weinberg 2011).

Cope's approach, which he calls "recombinancy," is not limited to music.

It can be applied to any creative practice in which new works are produced by reorganizing or recombining a finite set of elements, such as the twenty-six letters of the alphabet, the twelve tones of the musical scale, or the sixteen million colors discernible by the human eye.

As a result, other creative endeavors, such as painting, have adopted similar computational creativity methods.

The Painting Fool is an automated painter created by Simon Colton that seeks to be "considered seriously as a creative artist in its own right" (Colton 2012, 16).

To date, the algorithm has produced thousands of "original" artworks, which have been exhibited in both online and physical art galleries.

Obvious, a Paris-based collective consisting of the artists Hugo Caselles-Dupré, Pierre Fautrel, and Gauthier Vernier, uses a generative adversarial network (GAN) to produce portraits of a fictional family (the Belamys) in the style of the European masters.

Christie's auctioned one of these portraits, "Portrait of Edmond Belamy," for $432,500 in October 2018.

Designing ostensibly creative systems instantly runs into semantic and conceptual issues.

Creativity is an elusive phenomenon that is difficult to pin down or quantify.

Are these programs, algorithms, and systems really "creative," or are they merely a kind of "imitation," as some critics have called them?

This question is similar to John Searle's (1984, 32–38) Chinese Room thought experiment, which aimed to highlight the distinction between genuine cognitive activity, such as creative expression, and simple simulation or imitation.

Researchers in the field of computational creativity have introduced and operationalized a rather specific formulation to characterize their efforts: "The philosophy, science, and engineering of computational systems that, by taking on specific responsibilities, exhibit behaviors that unbiased observers would deem creative" (Colton and Wiggins 2012, 21).

The key word in this description is "responsibility."

"The term responsibilities highlights the difference between the systems we build and creativity support tools studied in the HCI [human-computer interaction] community and embedded in tools like Adobe's Photoshop, to which most observers would probably not attribute creative intent or behavior," Colton and Wiggins explain (Colton and Wiggins 2012, 21).

With a software application like Photoshop, "the program is only a tool to improve human creativity" (Colton 2012, 3–4); it is an instrument used by a human artist who is and remains responsible for the creative decisions and for the output produced with it.

Computational creativity research, on the other hand, "seeks to develop software that is creative in and of itself" (Colton 2012, 4).

On the one hand, one might react as we have in the past, dismissing contemporary technological advancements as simply another instrument or tool of human action—or what technology philosophers such as Martin Heidegger (1977) and Andrew Feenberg (1991) refer to as "the instrumental theory of technology." 

This is, in fact, the explanation supplied by David Cope in his own appraisal of his work's influence and relevance.

Emmy and other algorithmic composition systems, according to Cope, do not compete with or threaten to replace human composition.

They are just instruments used in and for musical creation.

"Computers represent just instruments with which we stretch our ideas and bodies," writes Cope. "Computers, programs, and the data utilized to generate their output were all developed by humanity. Our algorithms make music that is just as much ours as music made by our greatest human inspirations" (Cope 2001, 139).

According to Cope, no matter how much algorithmic mediation is invented and used, the musical composition generated by these advanced digital tools is ultimately the responsibility of the human person.

A similar argument may be made for other seemingly creative applications, such as the Go-playing algorithm AlphaGo or the painting program The Painting Fool.

According to this argument, when AlphaGo wins a major competition or The Painting Fool produces a stunning work of visual art that is exhibited in a gallery, there is still a human person (or persons) who is responsible for (or can respond or answer for) what has been produced.

The lines of attribution may become more complicated and protracted, but it can be argued that there is always someone behind the scenes who is in a position of authority.

Evidence of this already exists in cases where attempts have been made to shift responsibility to the machine.

Consider AlphaGo's decisive move 37 against Lee Sedol in game two.

Anyone who wants to know more about the move and its significance could, in principle, ask AlphaGo.

The algorithm, however, will have nothing to say.

In actuality, it was up to the human programmers and spectators to answer on AlphaGo's behalf and explain the importance and effect of the move.

As a result, as Colton (2012) and Colton et al. (2015) point out, if the mission of computational creativity is to succeed, the software will have to do more than create objects and behaviors that humans interpret as creative output.

It must also take responsibility for the work by accounting for what it did and how it did it.

"The software," Colton and Wiggins argue, "should be available for questioning about its motivations, processes, and products" (Colton and Wiggins 2012, 25), eventually becoming capable not only of generating titles for, and explanations and narratives about, the work but also of responding to questions by engaging in critical dialogue with its audience (Colton et al. 2015, 15).

At the same time, these algorithmic incursions into what had previously been a protected and exclusively human domain have opened up opportunities.

It is not only a question of whether computers, machine learning algorithms, or other applications can or cannot be held responsible for what they do or do not do; it is also a question of how we characterize, explain, and define creative responsibility in the first place.

This means that there is both a strong and a weak component to this effort, which Mohammad Majid al-Rifaie and Mark Bishop call strong and weak forms of computational creativity, following Searle's original distinction between strong and weak AI (Majid al-Rifaie and Bishop 2015, 37).

The types of application development and demonstrations presented by people and companies such as DeepMind, David Cope, and Simon Colton are examples of the "strong" sort.

However, these efforts have a "weak AI" component in that they simulate, operationalize, and stress test various conceptualizations of artistic responsibility and creative expression, resulting in critical and potentially insightful reevaluations of how we have defined these concepts in our own thinking.

Nothing has made Douglas Hofstadter reexamine his own thinking about thinking more than the effort to deal with and make sense of David Cope's Emmy (Hofstadter 2001, 38).

To put it another way, developing and experimenting with new algorithmic capabilities does not necessarily detract from human beings and what (hopefully) makes us unique, but it does provide new opportunities to be more precise and scientific about these distinguishing characteristics and their limits.


~ Jai Krishna Ponnappan




See also: 

AARON; Automatic Film Editing; Deep Blue; Emily Howell; Generative Design; Generative Music and Algorithmic Composition.

Further Reading

Boden, Margaret. 2010. Creativity and Art: Three Roads to Surprise. Oxford, UK: Oxford University Press.

Clerwall, Christer. 2014. “Enter the Robot Journalist: Users’ Perceptions of Automated Content.” Journalism Practice 8, no. 5: 519–31.

Colton, Simon. 2012. “The Painting Fool: Stories from Building an Automated Painter.” In Computers and Creativity, edited by Jon McCormack and Mark d’Inverno, 3–38. Berlin: Springer Verlag.

Colton, Simon, Alison Pease, Joseph Corneli, Michael Cook, Rose Hepworth, and Dan Ventura. 2015. “Stakeholder Groups in Computational Creativity Research and Practice.” In Computational Creativity Research: Towards Creative Machines, edited by Tarek R. Besold, Marco Schorlemmer, and Alan Smaill, 3–36. Amsterdam: Atlantis Press.

Colton, Simon, and Geraint A. Wiggins. 2012. “Computational Creativity: The Final Frontier.” In Frontiers in Artificial Intelligence and Applications, vol. 242, edited by Luc De Raedt et al., 21–26. Amsterdam: IOS Press.

Cope, David. 2001. Virtual Music: Computer Synthesis of Musical Style. Cambridge, MA: MIT Press.

Dreyfus, Hubert L. 1992. What Computers Still Can’t Do: A Critique of Artificial Reason. Cambridge, MA: MIT Press.

Feenberg, Andrew. 1991. Critical Theory of Technology. Oxford, UK: Oxford University Press.

Heidegger, Martin. 1977. The Question Concerning Technology, and Other Essays. Translated by William Lovitt. New York: Harper & Row.

Hoffman, Guy, and Gil Weinberg. 2011. “Interactive Improvisation with a Robotic Marimba Player.” Autonomous Robots 31, no. 2–3: 133–53.

Hofstadter, Douglas R. 1979. Gödel, Escher, Bach: An Eternal Golden Braid. New York: Basic Books.

Hofstadter, Douglas R. 2001. “Staring Emmy Straight in the Eye—And Doing My Best Not to Flinch.” In Virtual Music: Computer Synthesis of Musical Style, edited by David Cope, 33–82. Cambridge, MA: MIT Press.

Hui, Fan. 2016. “AlphaGo Games—English. DeepMind.” https://web.archive.org/web/20160912143957/https://deepmind.com/research/alphago/alphago-games-english/.

Majid al-Rifaie, Mohammad, and Mark Bishop. 2015. “Weak and Strong Computational Creativity.” In Computational Creativity Research: Towards Creative Machines, edited by Tarek R. Besold, Marco Schorlemmer, and Alan Smaill, 37–50. Amsterdam: Atlantis Press.

Searle, John. 1984. Minds, Brains and Science. Cambridge, MA: Harvard University Press.




Artificial Intelligence - What Is Automatic Film Editing?

  



Automatic film editing is a method of assembling full-motion video in which an algorithm, trained to follow basic rules of cinematography, cuts and sequences the footage.

Automated editing is part of a larger effort, known as intelligent cinematography, to incorporate artificial intelligence into filmmaking.

In the mid-1960s, the legendary director Alfred Hitchcock predicted that an IBM computer would one day be capable of turning a written script into a finished film.

Many of the concepts of modern filmmaking were created by Alfred Hitchcock.

One well-known rule of thumb is his assertion that, where possible, the size of a person or object in the frame should be proportional to its importance in the story at that particular moment.

Other film editing precepts that emerged from filmmakers' long experience are "exit left, enter right," which helps the audience follow the lateral movements of actors on the screen, and the 180-degree and 30-degree rules for maintaining spatial relationships between subjects and the camera.

Over time, these principles evolved into heuristics that regulate shot selection, editing, and rhythm and tempo.

Joseph Mascelli's Five C's of Cinematography (1965), for example, has become a large knowledge base for making judgments regarding camera angles, continuity, editing, closeups, and composition.

These human-curated rules, together with human-annotated stock footage and film clips, gave rise to the first artificial intelligence film editing systems.

IDIC, created by Warren Sack and Marc Davis at the MIT Media Lab in the early 1990s, is an example of a system from that era.

IDIC is modeled on the General Problem Solver of Herbert Simon, J. C. Shaw, and Allen Newell, an early artificial intelligence program that was intended to solve any general problem using the same basic approach.

IDIC was used to create fictitious Star Trek television trailers based on a human-specified narrative plan focusing on a certain plot element.

Several film editing systems depend on idioms, or standard techniques for editing and framing recorded action in certain contexts.

The idioms themselves will differ depending on the film's style, the setting, and the action to be shown.

In this manner, experienced editors' expertise may be accessed using case-based reasoning, with prior editing recipes being used to tackle comparable present and future challenges.

The editing of fight scenes, like that of ordinary conversations between characters, follows common idiomatic patterns.

This is the method used by Li-wei He, Michael F. Cohen, and David H. Salesin in their Virtual Cinematographer, which uses expert idiom knowledge in the editing of fully computer-generated video for interactive virtual environments.

He's group created the Declarative Camera Control Language (DCCL), which formalizes the control of camera locations in the editing of CGI animated films to match cinematographic traditions.
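The idea of an editing idiom can be pictured as a lookup from a type of scene to a stock sequence of shots. The sketch below is purely illustrative: the `Shot` class, the idiom table, and the shot sequences are hypothetical, and it is neither DCCL nor the Virtual Cinematographer's actual implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Shot:
    framing: str   # how the camera frames the action
    subject: str   # who or what is on screen

def conversation_idiom(a, b):
    # Hypothetical recipe: establish both speakers, then alternate between them.
    return [Shot("two-shot", f"{a} and {b}"),
            Shot("over-the-shoulder", a),
            Shot("over-the-shoulder", b)]

def fight_idiom(a, b):
    # Hypothetical recipe: wide shot for geography, close-ups for impact.
    return [Shot("wide shot", f"{a} and {b}"),
            Shot("close-up", a),
            Shot("close-up", b),
            Shot("wide shot", f"{a} and {b}")]

# The table plays the role of stored editing recipes, indexed by scene type.
IDIOMS = {"conversation": conversation_idiom, "fight": fight_idiom}

def plan_shots(scene_type, a, b):
    """Pick the stored idiom for a scene type and fill in the actors."""
    if scene_type not in IDIOMS:
        raise ValueError(f"no idiom known for scene type: {scene_type}")
    return IDIOMS[scene_type](a, b)

for shot in plan_shots("conversation", "Alice", "Bob"):
    print(shot.framing, "-", shot.subject)
```

A real system would also have to choose camera positions and decide when to cut; the sketch only shows how a stored recipe can be reused for a new scene of the same type.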

Researchers have recently begun experimenting with deep learning algorithms, using training data extracted from existing collections of well-known films of high cinematographic quality, to suggest the best cuts of new films.

Many of the newer applications are available for mobile devices, drones, or handheld equipment.

Easy automated video editing is expected to make short, engaging videos, assembled from footage shot by amateurs on smartphones, a preferred medium of exchange on future social media.

Photography currently fills that role.

In machinima films generated with 3D virtual game engines and virtual actors, automatic film editing is also used as an editing technique.




~ Jai Krishna Ponnappan



See also: 

Workplace Automation.


Further Reading

Galvane, Quentin, Rémi Ronfard, and Marc Christie. 2015. “Comparing Film-Editing.” In Eurographics Workshop on Intelligent Cinematography and Editing, edited by William H. Bares, Marc Christie, and Rémi Ronfard, 5–12. Aire-la-Ville, Switzerland: Eurographics Association.

He, Li-wei, Michael F. Cohen, and David H. Salesin. 1996. “The Virtual Cinematographer: A Paradigm for Automatic Real-Time Camera Control and Directing.” In Proceedings of SIGGRAPH ’96, 217–24. New York: Association for Computing Machinery.

Ronfard, Rémi. 2012. “A Review of Film Editing Techniques for Digital Games.” In Workshop on Intelligent Cinematography and Editing. https://hal.inria.fr/hal-00694444/.
