Greater Seattle Area
6K followers 500+ connections

Articles by Vikram

  • A new Alexa adventure

    In Sep 2011, I had the rare opportunity of joining the Echo/Alexa organization as the first Software Development…

Experience & Education

  • Foursquare

Patents

  • Precomputed communication parameters

    Issued US 11176933

    Systems and methods for precomputed communication parameters are disclosed. A request to establish a communication channel may be received from a first device at a remote system. The remote system may query precached communication parameters associated with the first device to identify modalities and/or codecs associated with the first device. The remote system may also identify the second device to establish the communication channel with and may identify modalities and/or codecs associated with the second device, such as by utilizing user accounts associated with the devices. A transport-address type may be identified, such as based on whether the devices are associated with the same network access point identifier and/or based on past communication channels established between the devices.

    See patent
  • Multi-assistant natural language input processing

    Issued US 11120790

    Techniques for a natural language processing (NLP) system to implement more than one assistant are described. The NLP system may receive a natural language input corresponding to more than one user command. The NLP system may respond to a first command, of the natural language input, using a TTS voice of a first NLP system assistant. The NLP system may respond to a second command, of the natural language input, using a TTS voice of a second NLP system assistant.

    See patent
  • Selection of master device for synchronized audio

    Issued US 10904665

    Synchronized output of audio on a group of devices can comprise sending audio data from an audio distribution master device to one or more slave devices in the group. Scores can be assigned to respective audio playback devices, the scores being indicative of a performance level of the respective audio playback devices acting as a master device. The device with the highest score is designated as a candidate master device and one or more remaining devices are designated as candidate slave(s). A throughput test is conducted with the highest scoring device acting as the candidate master device. The results of the throughput test are used to determine a master device for a group of devices. Latency of the throughput test can be reduced by using a prescribed time period for completion of the throughput test, and/or by selecting a first group configuration that passes the throughput test.

    See patent
  • Multi-tasking resource management

    Issued US 10880384

    Described herein is a system for allocating resources among multiple skills to enable multitasking. The system tracks use of resources using skill sessions. In one case, the system suspends a skill session to release a resource for allocation to another skill session. In another case, the system determines if multiple skill sessions can remain active and use resources to provide output to the user at the same time.

    See patent
  • Multi-modality presentation and execution engine

    Issued US 10847158

    Techniques for synchronously outputting content by one or more devices are described. A system may receive a user command and may receive content responsive to the command from an application(s). The content may include various kinds of data (e.g., audio data, image data, video data, etc.). The system may also receive a presentation framework from the application, with the presentation framework indicating how content responsive to the input command should be synchronously output by one or more devices. The system determines one or more devices proximate to the user, determines which of the one or more devices may be used to output content indicated in the presentation framework, and causes the one or more devices to output content in a synchronous manner.

    See patent
  • Implicit target selection for multiple audio playback devices in an environment

    Issued US 10839795

    A user can utter a voice command in an environment where multiple audio playback devices are located to have audio output on a single device, or a predefined group of devices in a synchronized manner. In instances when the voice command uttered by the user does not specify a target for audio output, an implicit target selection algorithm can evaluate one or more criteria to determine an appropriate target for output of the audio corresponding to the voice command. An example criterion is met if a predetermined time period has lapsed since a last utterance was detected by a device in the environment. However, other criteria can be evaluated for determining a target output device(s).

    See patent
  • Contact list reconciliation and permissioning

    Issued US 10811014

    Techniques for using validated communications identifiers of a user's communications profile to resolve entries in another user's contact list are described. When a user imports a contact list, the contact list may include multiple entries related to the same person. The system may identify one of the entries in the contact list that corresponds to a validated communications identifier stored in another user's communications profile. The system may identify other validated communications identifiers in the other user's communications profile and cross-reference them against the entries of the contact list. If the system determines the contact list includes entries for the different validated communications identifiers of the other user, the system may consolidate the entries into a single entry associated with the other user.

    See patent
  • Communication account contact ingestion and aggregation

    Issued US 10715470

    Techniques for detecting spam accounts in a system are described. When a system creates a user profile, the system may ingest a blocked communications list. The system may determine how many times each blocked communications number represented in the ingested blocked communications list is included in blocked communications lists of various users of the system. If a blocked communications number represented in the ingested blocked communications list is included in at least a threshold number of other blocked communications lists, the system may mark the communications number as spam at a system level and engage in appropriate mitigation techniques (e.g., throttle the phone number's activity, disable the phone number's ability to communicate with system devices, etc.).

    See patent
  • Context driven device arbitration

    Issued US 10482904

    This disclosure describes, in part, context-driven device arbitration techniques to select a speech interface device from multiple speech interface devices to provide a response to a command included in a speech utterance of a user. In some examples, the context-driven arbitration techniques may include executing multiple pipeline instances to analyze audio signals and device metadata received from each of the multiple speech interface devices which detected the speech utterance. A remote speech processing service may execute the multiple pipeline instances and analyze the audio signals and/or metadata, at various stages of the pipeline instances, to determine which speech interface device is to respond to the speech utterance.

    See patent
  • Audio playback device that dynamically switches between receiving audio data from a soft access point and receiving audio data from a local access point

    Issued US 10431217

    Synchronized output of audio on a group of devices comprises sending audio data from an audio distribution master device to one or more slave devices in the group. In group mode, a slave can be configured to receive audio data directly from a master device acting as a soft wireless access point (WAP) in an environment that includes a traditional WAP. In response to a user request to output audio via the slave in individual mode, the slave may be configured to dynamically switch to receiving audio data via the WAP in the environment without routing the audio data through the master device acting as the soft WAP. This dynamic switching to receiving audio data via the WAP in individual mode can reduce bandwidth consumption on the master device.

    See patent
  • Outputting notifications using device groups

    Issued US 10425781

    A system that determines that devices are co-located in an acoustic region and selects a single device to which to send incoming notifications for the acoustic region. The system may group devices into separate acoustic regions based on selection data that selects between similar audio data received from multiple devices. The system may select the best device for each acoustic region based on a frequency that the device was selected previously, input/output capabilities of the device, a proximity to a user, or the like. The system may send a notification to a single device in each of the acoustic regions so that a user receives a single notification instead of multiple unsynchronized notifications. The system may also determine that acoustic regions are associated with different locations and select acoustic regions to which to send a notification based on location.

    See patent
  • Application discovery and selection in language-based systems

    Issued US 10249296

    A language-based system may be configured to interact with a user by understanding natural language of the user and may provide functions and services in response to such natural language. Certain functions and services may be provided by third-party applications that register serviceable intents with the language-based system. A serviceable intent indicates an intent that the third-party application is able to fulfill or service. Upon determining an intent of the user based on natural language interaction with the user, the system identifies one of the third-party applications that has specified a matching serviceable intent and selects that application for use by the user.

    See patent
  • Contingent device actions during loss of network connectivity

    Issued US 10224056

    A speech-based system includes a local device in a user premises and a network-based control service that directs the local device to perform actions for a user. The control service may specify a first action that is to be performed upon detection by the local device of a stimulus. In some cases, performing the first action may rely on the availability of network communications with the control service or with another service. In these cases, the control service also specifies a second, fallback action that does not rely upon network communications. Upon detecting the stimulus, the local device performs the first action if network communications are available. If network communications are not available, the local device performs the second, fallback action.

    See patent
  • Attribute-based audio channel arbitration

    Issued US 10055190

    A speech-based system includes a local device in a user premises and a remote service that uses the local device to conduct speech dialogs with a user. The local device may also be directed to play audio such as music, audio books, etc. When designating audio for playing by the local device, the remote service may specify that the audio is either background audio or foreground audio. For background audio, the service indicates whether the background audio is mixable. For foreground audio, the service indicates an interrupt behavior. When the local device is playing background audio and receives foreground audio, the background audio is paused, attenuated, or not changed based on the indicated interrupt behavior of the foreground audio and whether the background audio has been designated as being mixable.

    See patent
  • Managing dialogs on a speech recognition platform

    Issued US 10026394

    A speech recognition platform configured to receive an audio signal that includes speech from a user and perform automatic speech recognition (ASR) on the audio signal to identify ASR results. The platform may identify: (i) a domain of a voice command within the speech based on the ASR results and based on context information associated with the speech or the user, and (ii) an intent of the voice command. In response to identifying the intent, the platform may perform a corresponding action, such as streaming audio to the device, setting a reminder for the user, purchasing an item on behalf of the user, making a reservation for the user or launching an application for the user. In some instances, the speech recognition platform engages in a back-and-forth dialog with the user in order to properly fulfill the user's request.

    See patent
  • Rule-based presentation of media items

    Issued US 9996148

    Features are disclosed for presenting multiple media items based on one or more rules defining how the items are to be presented. One media item may be presented, and during presentation any number of additional media items may be received or scheduled for presentation. Rules may define which media items have priority over others, which media items may interrupt others or be interrupted, which media items may be delayed or presented early, whether particular media items are time-critical such that they are not to be delayed but rather should take presentation priority over others, etc. Metadata may be associated with particular media items or categories thereof. The metadata can provide details regarding how the rules should be applied to those media items. User feedback may also be obtained, and may affect the further application of the rules.

    See patent
  • Voice interaction application selection

    Issued US 9741343

    An open framework for computing devices to dispatch voice-based interactions to supporting applications. Applications are selected on a trial-and-error basis to find an application able to handle the voice interaction. Dispatching to the applications may be performed without a determination of meaning conveyed in the interaction, with meaning determined by the individual applications. Once an application acts upon a voice interaction, that application may be given first-right-of-refusal for subsequent voice interactions.

    See patent
  • Load-balanced, persistent connection techniques

    Issued US 9712625

    Techniques for creating a persistent connection between client devices and one or more remote computing resources, which may form a portion of a network-accessible computing platform. This connection may be considered "permanent" or "nearly permanent" to allow the client device to both send data to and receive data from the remote resources at nearly any time. In addition, both the client device and the remote resources may establish virtual channels over this single connection. If no data is exchanged between the client device and the remote computing resources for a threshold amount of time, then the connection may be severed and the client device may attempt to establish a new connection with the remote computing resources.

    See patent
  • Third party audio announcements

    Issued US 9692742

    A system enables end user devices to receive audio announcements from third party cloud-based resources. For example, the system may include a first party cloud-based resource providing tokens to the third party cloud-based resource in order to prevent the third party cloud-based resource from causing audio announcements to be output by user devices without authorization. In some cases, the tokens may be time based and prevent the third party cloud-based resource from causing audio announcements to be output by user devices after a predefined amount of time. In other examples, the tokens may be use based and prevent the third party cloud-based resource from causing the user device to output more than a predetermined number of audio announcements.

    See patent
  • Application focus in speech-based systems

    Issued US 9552816

    A speech-based system includes an audio device in a user premises and a network-based service that supports use of the audio device by multiple applications. The audio device may be directed to play audio content such as music, audio books, etc. The audio device may also be directed to interact with a user through speech. The network-based service monitors event messages received from the audio device to determine which of the multiple applications currently has speech focus. When receiving speech from a user, the service first offers the corresponding meaning to the application, if any, that currently has primary speech focus. If there is no application that currently has primary speech focus, or if the application having primary speech focus is not able to respond to the meaning, the service then offers the user meaning to the application that currently has secondary speech focus.

    See patent
  • Speech recognition platforms

    Issued US 9299346

    A speech recognition platform configured to receive an audio signal that includes speech from a user and perform automatic speech recognition (ASR) on the audio signal to identify ASR results. The platform may identify: (i) a domain of a voice command within the speech based on the ASR results and based on context information associated with the speech or the user, and (ii) an intent of the voice command. In response to identifying the intent, the platform may perform a corresponding action, such as streaming audio to the device, setting a reminder for the user, purchasing an item on behalf of the user, making a reservation for the user or launching an application for the user. The speech recognition platform, in combination with the device, may therefore facilitate efficient interactions between the user and a voice-controlled device.

    See patent
  • Storing state information from network-based user devices

    Issued US 9293138

    Network-based services may be provided to a user through the use of a speech-based user device located within a user environment. The speech-based user device may accept speech commands from a user and may also interact with the user by means of generated speech. Operating state of the speech-based user device may be provided to the network-based service and stored by the service. Applications that provide services through the speech-based interface may request and obtain the stored state information.

    See patent
  • User perceived gapless playback

    Issued US 9282403

    A computing system for selecting and providing content items to a device. The device is configured to output a first content item to a user and to detect events related to the output of the first content item and, in response, to provide a notification to a cloud service related to the event. The device is further configured to receive at least a second content item from the cloud services and to buffer the second content item while outputting the first content item.

    See patent
  • Identification of utterance subjects

    Issued US 8977555

    Features are disclosed for generating markers for elements or other portions of an audio presentation so that a speech processing system may determine which portion of the audio presentation a user utterance refers to. For example, an utterance may include a pronoun with no explicit antecedent. The marker may be used to associate the utterance with the corresponding content portion for processing. The markers can be provided to a client device with a text-to-speech ("TTS") presentation. The markers may then be provided to a speech processing system along with a user utterance captured by the client device. The speech processing system, which may include automatic speech recognition ("ASR") modules and/or natural language understanding ("NLU") modules, can generate hints based on the marker. The hints can be provided to the ASR and/or NLU modules in order to aid in processing the meaning or intent of a user utterance.

    See patent
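
The threshold check described in the US 10715470 abstract (a number is marked as spam once it appears in enough other users' blocked lists) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `find_spam_numbers`, the list-of-lists data layout, and the sample phone numbers are all hypothetical.

```python
from collections import Counter


def find_spam_numbers(ingested_blocklist, other_blocklists, threshold):
    """Return numbers from the ingested list that appear in at least
    `threshold` other users' blocked lists (hypothetical helper)."""
    counts = Counter()
    for blocklist in other_blocklists:
        # Count each number at most once per user, even if listed twice.
        for number in set(blocklist):
            counts[number] += 1
    return {n for n in set(ingested_blocklist) if counts[n] >= threshold}


# Made-up example data: three existing users' blocked lists.
others = [
    ["555-0100", "555-0101"],
    ["555-0100"],
    ["555-0100", "555-0102"],
]
spam = find_spam_numbers(["555-0100", "555-0102"], others, threshold=2)
```

With this sample data only "555-0100" reaches the threshold of two other lists. Mitigation steps such as throttling are left out of the sketch.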
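
The fallback behavior described in the US 10224056 abstract (perform the first action if the network is reachable, otherwise the second) reduces to a small dispatch step on the local device. The sketch below is a minimal illustration, assuming the control service has already delivered both actions as callables; `handle_stimulus`, `stream_song`, and `play_local_tone` are hypothetical names, not taken from the patent.

```python
def handle_stimulus(primary_action, fallback_action, network_available):
    """Run the network-dependent action when possible, else the fallback.

    All three arguments are callables supplied by the caller (assumed)."""
    if network_available():
        return primary_action()
    return fallback_action()


# Hypothetical actions for an alarm stimulus.
def stream_song():
    return "streaming song from service"


def play_local_tone():
    return "playing built-in tone"


online_result = handle_stimulus(stream_song, play_local_tone, lambda: True)
offline_result = handle_stimulus(stream_song, play_local_tone, lambda: False)
```

Checking connectivity at the moment the stimulus fires, rather than when the actions are specified, matches the abstract's wording that the choice is made "upon detecting the stimulus".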

Recommendations received

2 people have recommended Vikram
